-
Notifications
You must be signed in to change notification settings - Fork 14
Add TY-24 state file intake archiver service #6017
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add TY-24 state file intake archiver service #6017
Conversation
Heroku app: https://gyr-review-app-6017-79ac9d54b8fc.herokuapp.com/ |
…3-archiver-service-for-2024
…3-archiver-service-for-2024
…3-archiver-service-for-2024
# we batch these since archiving involves copying the submission pdf to a new location in s3 | ||
StateFile::Ty24ArchiverService.archive!( | ||
state_code: args[:state_code], | ||
batch_size: 50 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
last year the batch number was 10, it seemed a little stingy to me so I increased it but happy to go down again
|
||
et = Time.find_zone('America/New_York') | ||
start_date = et.parse('2025-01-15 00:00:00') # state_file_start_of_open_intake | ||
end_date = et.parse('2025-10-25 23:59:59') # state_file_end_of_in_progress_intakes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how many times will this be run? i’m wondering if these dates might need to be changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, because of the query for the already archived intakes, this doesn’t have to change. 👍
# remove intakes that have already been archived | ||
archived_emails = StateFileArchivedIntake.where(state_code: state_code, tax_year: tax_year) | ||
.pluck(:email_address) | ||
archived_phones = StateFileArchivedIntake.where(state_code: state_code, tax_year: tax_year) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why query by the email and phone as opposed to the hashed ssn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess they are both unique so it doesn’t really matter but i was just curious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is unique identifier that they use to sign in with, they either pick email or phone number. Last year they only did this check on emails since we didn't have sms working in the previous season before that. But actually I was thinking about it more and I don't know why we didn't just add a key reference for that state file intake like data_source_id
instead wdyt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 ah, i think this what you have is OK. considering that this will eventually exist in another db the data_source_id
would go away.
) | ||
archived.save! | ||
|
||
if intake.submission_pdf.attached? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why would we archive intakes where there is no submission pdf?
🤔 feels like a weird state to be in considering the query checks if the submission has been accepted.
could we just query to guarantee that the submission PDF is present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question, I assumed that there might be weird cases where they have an accepted return but the submission pdf itself got deleted for some reason. But any case this is just following the structure from last year. @mpidcock martha, as someone who worked on the service last year-- do you know why we would want to archive intakes where there are no submission pdfs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's my best recollection about this: there really shouldn't ever be a case where there's no submission pdf, but my assumption was that if we did fail to retrieve it, then that was a system error, and we would need to investigate on a case-by-case basis about what happened. I didn't want to simply skip over any submissions with no pdf, because we still know they should have a pdf, and I wanted to be able to flag anyone in that state. We didn't have an issue though, they all had pdfs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay so the plan would have been to note all the ones that didn't have pdfs via the rails logger warning and then investigate why there wasn't a submission pdf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess another thing I could do is just not create an archived intake if they don't have a submission pdf and just print the logger warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I think that's ok, we were in a rush to delete the old records before so I was prioritizing archiving anything we might need so that the og records could be safely cleared, but we don't have that concern this year, so skipping + logging should be fine!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⭐
great work em! thank you ❤️ |
…3-archiver-service-for-2024
Link to pivotal/JIRA issue
Is PM acceptance required? (delete one)
Reminder: merge main into this branch and get green tests before merging to main
What was done?
How to test?
bundle exec rake state_file:ty24:archive_#{state_code}