Tap Forms – Organizer Database App for Mac, iPhone, and iPad › Forums › Script Talk › Using scripts to fill in data and categorize from external files?
- This topic has 21 replies, 3 voices, and was last updated 3 years, 11 months ago by Brent S.
-
AuthorPosts
-
November 9, 2020 at 2:55 PM #42535
Brent SParticipantHi Everyone. I am new to scripting, so want to see if this is possible. Imagine the following situation. You have 50,000 food recipes all in PDF form input into the Tap Forms Database, each PDF with a unique file name and in a field where it is a “File Attachment”. So, there are then 50,000 records in the db. Many fields have been added to the database, but are empty until you manually go in and populate them. Is it possible to use scripts to help populate the data?
For example, lets say outside TapForms you use Spotlight on the iMac to identify 60 PDF’s (recipes) that contained the words “Gummy Bear”. You make an Excel File (.csv) with those 60 unique file names. Can you now write a script that would reference those file names in a .csv file, find/look up the specific records in tapforms where those files are attached and then do two things:
a) choose a picklist field that already exists, and choose one of the presets. For example, if this was a list of meal choices like “Breakfast, Lunch, Dinner, Dessert” the script would make it choose “Dessert” for each of those 60 records, but not any other records in the database
b) If you had a textfile field named KEYWORDS then have it add the term “Gummy Bear” to the list of keywords again just for those 60 records
If this is possible then I can see how scripting could help me and others categorize a very large database that would take a very long time to categorize manually. Any direction on what a script might look like would be extremely appreciated.
Thanks,
BrentNovember 9, 2020 at 8:27 PM #42537
Daniel LeuParticipantYou can import a csv file into a separate form and then use a script to process all records and update your reference records accordingly. Then you can assign values of meal choices accordingly and update your keywords field as well.
But this seems to be a rather tedious job if you want to update all 50k food recipes in your Tap Forms database this way. I would start with looking into software that converts your PDFs into text files. Then you can add the content of your text files into fields inside Tap Forms. These fields are searchable, which makes it much easier to categorize your recipes. But yeah, it might be a challenge to convert your PDFs, specially if they were scanned and the quality is poor.
But if you want to go the CSV route, I would create a spreadsheet with columns for
PDF
,Category
, andKeyword
. Here is my example:function addKeyword(rec, kw_id, kw){ var keywords = rec.getFieldValue(kw_id); if (keywords == undefined || keywords == ""){ rec.setFieldValue(kw_id, kw); } else { rec.setFieldValue(kw_id, keywords + ',' + kw); } } function Update_Records() { var cvsForm = document.getFormNamed("csv"); var recipesForm = document.getFormNamed("Recipes"); // recipes var category_id = 'fld-9828d73d445a4746a8dba4c0dac89716'; var keyword_id = 'fld-073ee12d29174c2eb64f7cf91f5bb383'; var pdf_id = 'fld-e870153bd0a647c9823ca998c213d1fd'; // get csv field Ids var csvPdf_id = cvsForm.getFieldNamed('PDF').getId(); var csvCategory_id = cvsForm.getFieldNamed('Category').getId(); var csvKeyword_id = cvsForm.getFieldNamed('Keyword').getId(); // Loop over all csv entries for (entry of cvsForm.getRecords()){ // get pdf file name var pdfName = entry.getFieldValue(csvPdf_id); // replace spaces with underscores pdfName = pdfName.replace('/ /g', '_'); console.log("Processing: " + pdfName); // Loop over all recipes records for (recipe of recipesForm.getRecords()){ // this assumes that there is only one PDF per recipe! var filename = recipe.getFieldValue(pdf_id)[0].filename; if (pdfName == filename){ console.log("Found match: " + filename); // Update recipe record addKeyword(recipe, keyword_id, entry.getFieldValue(csvKeyword_id)); // Set category recipe.setFieldValue(category_id, entry.getFieldValue(csvCategory_id)) break; } } document.saveAllChanges(); } return 'done'; } Update_Records();
The first row of my spreadsheet are the field labels:
PDF
,Category
,Keyword
. They are used to create the field names when creating a new form calledcsv
upon importing your csv. The script is part of yourRecipes
form.This script assumes that the keywords are comma separated. Keywords are added to your record if new ones are found. The category field is overwritten!
In order for the script to work, you need to update
category_id
,keyword_id
, andpdf_id
according to your recipes form.Hope this helps! Happy cooking!
November 9, 2020 at 9:56 PM #42539
Brent SParticipantThank you Daniel! This is great. I am glad to know this is possible. It is going to take me a little bit of time to figure out scripting, but you have given me a great start.
My database is actually not a recipe database. That was just a simple example to illustrate the problem I have which is a lot more complex. Your answer will help me generate a solution.
The documents (readable PDF’s) I am dealing with are unstructured contracts and each are about 30 pages long, and yes each has its own unique file name. I do have them all in HTML too, but I think that might be a bit much to have that much text in Tap Forms?
So, just circling back to the one part of the solution, where I have a pick list with four choices, in my example, “Breakfast, Lunch, Dinner, Dessert” your code is as follows:
// Set category
recipe.setFieldValue(category_id, entry.getFieldValue(csvCategory_id))If I understand you correctly, I would put the word “Dessert” as one of the entries in the CSV file under “Category” and then this will overwrite it in the Pick List?
Anyhow. Thanks again!
November 10, 2020 at 12:48 AM #42540
Daniel LeuParticipant// Set category
recipe.setFieldValue(category_id, entry.getFieldValue(csvCategory_id))If I understand you correctly, I would put the word “Dessert” as one of the entries in the CSV file under “Category” and then this will overwrite it in the Pick List?
Yes, but it would not overwrite the pick list, but just the value assigned to the field.
Pick list is just a list of predefined values for a field. You still can enter a different value to that field.
November 14, 2020 at 5:21 PM #42586
Brent SParticipantHi Daniel,
A few more questions if I may. I have tried to figure things out by watching some of Sam’s videos etc., but I am a little stuck.
So first, from my reading it looks like I need to use a “Form Level Script” as opposed to a “Field Level Script” as I am updating existing records.
I only have two forms, my main database and the cvs data I want to input. Is it correct that I must place the Form Level Script in my main database and not in the cvs form? (That seems to be the only way I can get the correct fld-hash numbers that I need into the script).
Next, much of the script makes sense to me, and other parts seem somewhat circular. Could you look what I have done and see where I messed up? …because I think I did a good job on that : )
I need the script to look up each record that aligns with the PDF file name in the cvs file, and then update the records in the main database form called “Contracts”
I have attached a few screen shots in case that helps with context.
Thanks again.
Brent——————-
function addKeyword(rec, kw_id, kw) {
var keywords = rec.getFieldValue(kw_id);
if (keywords == undefined || keywords == “”){
rec.setFieldValue(kw_id, kw);
} else {
rec.setFieldValue(kw_id, keywords + ‘,’ + kw);
}
}function Update_Records() {
var csvForm = document.getFormNamed(“csv”);
var ContractsForm = document.getFormNamed(“Contracts”);// Contracts
var pdffilepath_7_id = ‘fld-3ea77ccc18604174b722ece105965c44’;
var license_category_id = ‘fld-d63ab1856e554efebe62b9f1826c388d’;
var license_subcategory_id = ‘fld-f0c47c7e6ad244fe955e2c24cb448590’;
var agreement_keywords_id = ‘fld-a2126cbe512646e9ba144fb5a90b06dc’;// get csv field Ids
var pdffilepath_7 = cvsForm.getFieldNamed(‘PDF’).getId();
var license_category = cvsForm.getFieldNamed(‘Category’).getId(); var license_subcategory = cvsForm.getFieldNamed(‘Subcategory’).getId();
var agreement_keywords = cvsForm.getFieldNamed(‘Keyword’).getId();// Loop over all csv entries
// So this tells my Contracts database to look at the cvs form?
for (entry of cvsForm.getRecords()){// get pdf file name
// is it looking for another fld-hash here?
var pdfName = entry.getFieldValue(csvPdf_id);// replace spaces with underscores
pdfName = pdfName.replace(‘/ /g’, ‘_’);
console.log(“Processing: ” + pdfName);// Loop over all LICENSEdb records
for (Contracts of Contracts.getRecords()){// this assumes that there is only one PDF per Contract
// where have I defined (pdf_id) before? Do I need to ?
var filename = Contract.getFieldValue(pdf_id)[0].filename;if (pdfName == filename){
console.log(“Found match: ” + filename);// Update Contracts record
addsubcategory (Contracts, subcategory_id, entry.getFieldValue(csvKeyword_id));// Update Contracts record
addKeyword(Contracts, keyword_id, entry.getFieldValue(csvSubcategory_id));// Set category
Contracts .setFieldValue(category_id, entry.getFieldValue(csvCategory_id))break;
}}
document.saveAllChanges();return ‘done’;
}Update_Records();
Attachments:
You must be logged in to view attached files.November 15, 2020 at 11:27 AM #42602
Sam MoffattParticipantA few more questions if I may. I have tried to figure things out by watching some of Sam’s videos etc., but I am a little stuck.
Curious for feedback, were any of them helpful? What was not helpful and what would improve it? What was helpful and how was it helpful? I had thought about doing another simple intro one on the next few snippets which includes functions but haven’t done it. I tried another video on some other topics but wasn’t happy with how it turned out.
So first, from my reading it looks like I need to use a “Form Level Script” as opposed to a “Field Level Script” as I am updating existing records.
I think my rule of thumb is that field scripts are meant to act on the contents of the record as those contents change whilst form scripts are more one off or bulk action changes. You can use them somewhat interchangeably and the question is do you want this thing to run all of the time or is it something you want to control when it runs?
I only have two forms, my main database and the cvs data I want to input. Is it correct that I must place the Form Level Script in my main database and not in the cvs form? (That seems to be the only way I can get the correct fld-hash numbers that I need into the script).
While it might not be intuitive you can access any form and any field from scripts (fields or forms) in any form within a document. The editor won’t show you the fields that aren’t directly linked to the form that you’re in but you can use the editor to go to other forms and get their field ID’s. You can also get the field ID from below the description box in the field editor UI.
You can use the code button to wrap your code with backticks (e.g. `code here`) to make it a little easier to read (also ran it through a formatter since the post didn’t have any indentation):
function addKeyword(rec, kw_id, kw) { var keywords = rec.getFieldValue(kw_id); if (keywords == undefined || keywords == "") { rec.setFieldValue(kw_id, kw); } else { rec.setFieldValue(kw_id, keywords + ',' + kw); } } function Update_Records() { var csvForm = document.getFormNamed("csv"); var ContractsForm = document.getFormNamed("Contracts"); // Contracts var pdffilepath_7_id = 'fld-3ea77ccc18604174b722ece105965c44'; var license_category_id = 'fld-d63ab1856e554efebe62b9f1826c388d'; var license_subcategory_id = 'fld-f0c47c7e6ad244fe955e2c24cb448590'; var agreement_keywords_id = 'fld-a2126cbe512646e9ba144fb5a90b06dc'; // get csv field Ids var pdffilepath_7 = cvsForm.getFieldNamed('PDF').getId(); var license_category = cvsForm.getFieldNamed('Category').getId(); var license_subcategory = cvsForm.getFieldNamed('Subcategory').getId(); var agreement_keywords = cvsForm.getFieldNamed('Keyword').getId(); // Loop over all csv entries // So this tells my Contracts database to look at the cvs form? for (entry of cvsForm.getRecords()) { // get pdf file name // is it looking for another fld-hash here? var pdfName = entry.getFieldValue(csvPdf_id); // replace spaces with underscores pdfName = pdfName.replace('/ /g', '_'); console.log("Processing: " + pdfName); // Loop over all LICENSEdb records for (Contracts of Contracts.getRecords()) { // this assumes that there is only one PDF per Contract // where have I defined (pdf_id) before? Do I need to ? var filename = Contract.getFieldValue(pdf_id)[0].filename; if (pdfName == filename) { console.log("Found match: " + filename); // Update Contracts record addsubcategory(Contracts, subcategory_id, entry.getFieldValue(csvKeyword_id)); // Update Contracts record addKeyword(Contracts, keyword_id, entry.getFieldValue(csvSubcategory_id)); // Set category Contracts.setFieldValue(category_id, entry.getFieldValue(csvCategory_id)) break; } } } document.saveAllChanges(); return 'done'; } Update_Records();
First thing I noticed when I put it through a formatter was that there seemed to be a missing bracket, I’ve added an extra one before the
document.saveAllChanges()
line and everything balanced out. I used Visual Studio Code to reformat it but I also found an online JavaScript beautifier as well.Next, much of the script makes sense to me, and other parts seem somewhat circular. Could you look what I have done and see where I messed up? …because I think I did a good job on that : )
I need the script to look up each record that aligns with the PDF file name in the cvs file, and then update the records in the main database form called “Contracts”
Now I see you added an extra call:
addsubcategory (Contracts, subcategory_id, entry.getFieldValue(csvKeyword_id));
What this change is doing is trying to call a function here called
addsubcategory
but that function doesn’t exist. I think you’re trying to copy theaddKeyword
line for a new field and if you look at the very top of the script there is afunction addKeyword
which defines or creates the function. If what you’re trying to do is create a comma separated list of sub categories but in a different field, then I don’t think you need to do a new function, I think you can just calladdKeyword
with the sub category field ID:addKeyword (Contracts, subcategory_id, entry.getFieldValue(csvKeyword_id));
Try fixing the curly braces and changing
addsubcategory
over toaddkeyword
as above and see if that gets you a little closer.November 15, 2020 at 7:02 PM #42612
Brent SParticipantHi Sam,
Thank you for the help and feedback. I am getting close to getting it to work, but still a bit shaky in terms of my understanding of things. But first things first, let me answer your questions:
Curious for feedback, were any of them helpful?
Yes. Absolutely! Thank you for making them. The original “Creating a Licensing Database” was great. One of the other videos on Scripting (Deep Dive) helped me make sense of the fld-hash codes, as I had no idea how Daniel had come up with that until I watched your video.
What was not helpful and what would improve it?
I like when videos can compress content in 5-10 minutes. Better to break up videos into smaller digestible chunks and not to try to cover too many topics (IMHO)What was helpful and how was it helpful?
I really like how you went through the Javascript Code line by line to explain how the computer is interpreting the code.
I had thought about doing another simple intro one on the next few snippets which includes functions but haven’t done it. I tried another video on some other topics but wasn’t happy with how it turned out.
This would be very helpful. For me, an example where you had two very simple forms and you wanted to use a particular Snippet like “Set Field Value”. Ideally the video would explain what you are trying to do, why you are trying to do it, write the code and run the Snippet, show the effect that it had. Ideally the length of the video would be approximate 5 min. For a beginner like me I need to see a very basic version of something working so that I can alter it for my own use. Once I get the simple version working, hopefully then I can build on that for things that are more complex.
There is your feedback : ). And thanks again for making the videos.
Now, a small request for additional help…..
I have not been able to debug the script that I have to get it to work. I am wondering if you can help me vastly simplify the script just to doing one or two things instead of about 4 things at the same time. Basically I would like to start over. So, as I said, I only have two forms. One is my main database called “Contracts” which has thousands of unique PDF’s attached in a file attachment field. The second form (which I named csv) is a sub-list of those PDFs (about 1200 of 36000) that I want to tag with the the word “vaccine”.
Therefore, I need a script that will take my small list of 1200 pdfs, find the corresponding/matching file in the larger Contracts database, and insert an entry “vaccine” into the field “Agreement Keywords”. Seems like this should be simple.
If I can just accomplish that then I can change the code later to take action in other fields.
Key information
The location of the field I need to access in the contracts database:
var pdffilepath_7_id = ‘fld-3ea77ccc18604174b722ece105965c44’;
var agreement_keywords_id = ‘fld-a2126cbe512646e9ba144fb5a90b06dc’;The location of the data in the csv database:
var keyword_id = ‘fld-53de1c4c633a4da6a4d0213158b5ef0a’;
var pdf_id = ‘fld-0cbd22a473cd4c58aa1dc47341a7157b’;I think this ability to have scripts auto-populate fields is critically important to anyone like me that has a massive database of documents that are unstructured and need to be tagged and categorized within Tapforms.
Thanks,
BrentNovember 16, 2020 at 11:57 AM #42618
Daniel LeuParticipantIn hindsight, it would have been better if you had described your problem as is and not made up this hypothetical recipe case….
As you might have noticed, all CSV form related fields use the a
csv
prefix to make it clear where this field resided. And yes, my script is a from script that is part of the contracts form. My assumption was that the CSV form is deleted after use and recreated with a new tagging set. So it makes sense that the script has a more permanent location.Here is the updated script. Hope this works.
function addKeyword(rec, kw_id, kw){ var keywords = rec.getFieldValue(kw_id); if (keywords == undefined || keywords == ""){ rec.setFieldValue(kw_id, kw); } else { rec.setFieldValue(kw_id, keywords + ',' + kw); } } function Update_Records() { var csvForm = document.getFormNamed("csv"); var contractsForm = document.getFormNamed("Contracts"); // recipes var keyword_id = 'fld-a2126cbe512646e9ba144fb5a90b06dc'; var pdf_id = 'fld-3ea77ccc18604174b722ece105965c44'; // get csv field Ids // var csvPdf_id = csvForm.getFieldNamed('PDF').getId(); // var csvKeyword_id = cvsForm.getFieldNamed('Keyword').getId(); // get use fixed field Ids!!!!!!!!!!!! var csvPdf_id = 'fld-0cbd22a473cd4c58aa1dc47341a7157b'; var csvKeyword_id = 'fld-53de1c4c633a4da6a4d0213158b5ef0a'; // Loop over all csv entries for (entry of cvsForm.getRecords()){ // get pdf file name of CVS record var cvsPdfName = entry.getFieldValue(csvPdf_id); // replace spaces with underscores cvsPdfName = cvsPdfName.replace('/ /g', '_'); console.log("Processing: " + cvsPdfName); // Loop over all contract records for (contract of contractsForm.getRecords()){ // this assumes that there is only one PDF per recipe! var pdfName = contract.getFieldValue(pdf_id)[0].filename; if (cvsPdfName == pdfName){ console.log("Found match: " + pdfName); // Update contract record addKeyword(contract, keyword_id, entry.getFieldValue(csvKeyword_id)); break; } } document.saveAllChanges(); } return 'done'; } Update_Records();
November 16, 2020 at 3:04 PM #42621
Brent SParticipantHi Daniel,
You are 100% correct. I realized once I saw the complexity of the answer that a hypothetical example was not the way to go. Noob mistake.
The code is almost working. Its at least now starting to process files, but then encounters the following error. (See attached).
I went through the code and replaced some entries that had “cvs” instead of “csv” and replaced the word “contract” with “Contracts” just to be consistent (not sure if the last change mattered or had any effect), but still have the same error.
Thanks again. I hope this helps many people on the forum in addition to myself.
Cheers,
BrentAttachments:
You must be logged in to view attached files.November 16, 2020 at 5:33 PM #42623
Daniel LeuParticipantI thought I caught all
cvs
typos…There is a loop that goes over all contracts. So
contract
points to one entry. It makes more sense to use the singular form in this case.The error points to
var pdffilepath_7_id = ‘fld-3ea77ccc18604174b722ece105965c44’;
It was my understanding that this would be the PDF file attachment field.Could you share your form template? (file -> export -> form template). Thanks.
November 16, 2020 at 8:06 PM #42624
Brent SParticipantHi Daniel,
I am not comfortable posting my entire db structure online. I would send it privately.
The location of the PDF files in the main contracts database is indeed:
var pdffilepath_7_id = ‘fld-3ea77ccc18604174b722ece105965c44’;I just checked it again.
There are 35,747 entries (PDF’s), so it is only one field, but there are many entries.
I can share the csv form publicly with you if that helps to see the other end.
I have included a slice of the db that shows PDFFILEPATH 7 so you can indeed see that is where the different PDF Files are being held.
Thanks,
BrentAttachments:
You must be logged in to view attached files.November 16, 2020 at 8:35 PM #42627
Sam MoffattParticipantI’d also to check that there is a file attachment and skip ones that don’t:
// Loop over all contract records for (contract of contractsForm.getRecords()){ // this assumes that there is only one PDF per recipe! var pdfName = contract.getFieldValue(pdf_id)[0].filename;
to:
// Loop over all contract records for (contract of contractsForm.getRecords()){ if (!contract.getFieldValue(pdf_id) || contract.getFieldValue(pdf_id).length == 0) { console.log('Record is missing PDF: ' + contract.getUrl()); continue; } // this assumes that there is only one PDF per recipe! var pdfName = contract.getFieldValue(pdf_id)[0].filename;
That should check there is an attachment and validate it’s not set plus log a message with a clickable link in the console. If that ends up being everything then that’s not great but it wouldn’t bomb part way through.
There are 35,747 entries (PDF’s), so it is only one field, but there are many entries.
Many records or a single record with multiple PDFs?
Re feedback: The last few videos hit half an hour and I need to work on getting the content back in the 10 minute range. I ended up throwing a bunch of stuff in one of the videos which when I went back should have probably ended up broken out into a few videos (ended up putting in chapter links for at least one of them). I’m trying to keep them as continuous takes to make sure I don’t miss any steps along the way though sometimes that leads to mild confusion as it’s also not scripted/practiced. Sometimes seeing something small and seemingly unimportant is the hook you need to understand. Thanks for the feedback!
November 16, 2020 at 10:32 PM #42629
Brent SParticipantHi Sam,
Thanks. Almost there. So close I can taste it. See error message attached.
I thought perhaps I am missing one line that starts with something like:
var csvPdfName = csv.getFieldValue(csvPdf_id)[0].filename;
However, I tried this and it did not work and just gave me another message “Category Script: ReferenceError: Can’t find variable: csv, line:(null)
With respect to your question, there are 35,747 records. Each has one PDF associated with it.
Cheers,
BrentAttachments:
You must be logged in to view attached files.November 17, 2020 at 1:28 AM #42632
Daniel LeuParticipantYour code is a bit messed up with two contract loops…
unction addKeyword(rec, kw_id, kw){ var keywords = rec.getFieldValue(kw_id); if (keywords == undefined || keywords == ""){ rec.setFieldValue(kw_id, kw); } else { rec.setFieldValue(kw_id, keywords + ',' + kw); } } function Update_Records() { var csvForm = document.getFormNamed("csv"); var contractsForm = document.getFormNamed("Contracts"); // recipes var keyword_id = 'fld-a2126cbe512646e9ba144fb5a90b06dc'; var pdf_id = 'fld-3ea77ccc18604174b722ece105965c44'; // get csv field Ids // var csvPdf_id = csvForm.getFieldNamed('PDF').getId(); // var csvKeyword_id = csvForm.getFieldNamed('Keyword').getId(); // get use fixed field Ids!!!!!!!!!!!! var csvPdf_id = 'fld-0cbd22a473cd4c58aa1dc47341a7157b'; var csvKeyword_id = 'fld-53de1c4c633a4da6a4d0213158b5ef0a'; // Loop over all csv entries for (entry of csvForm.getRecords()){ // get pdf file name of CSV record var csvPdfName = entry.getFieldValue(csvPdf_id); // replace spaces with underscores csvPdfName = csvPdfName.replace('/ /g', '_'); console.log("Processing: " + csvPdfName); // Loop over all contract records for (contract of contractsForm.getRecords()){ if (!contract.getFieldValue(pdf_id) || contract.getFieldValue(pdf_id).length == 0) { console.log('Record is missing PDF: ' + contract.getUrl()); continue; } // this assumes that there is only one PDF per contract! var pdfName = contract.getFieldValue(pdf_id)[0].filename; if (csvPdfName == pdfName){ console.log("Found match: " + pdfName); // Update contract record addKeyword(contract, keyword_id, entry.getFieldValue(csvKeyword_id)); break; } } document.saveAllChanges(); } return 'done'; } Update_Records();
November 17, 2020 at 4:56 PM #42636
Brent SParticipantDaniel and Sam,
Thank you! Thank you! Thank you!
Success!!
Like Daniel mentioned in his original response, the process is a bit tedious. Indeed this is true, took my very fast computer (new iMac with 64G RAM) about 6 hours to complete this task. I was expecting about 6 minutes. So, I am going to need to find a more efficient way to do this. It might involve putting the full text of the PDF documents into a text field so that Tap Forms can just use its own search capability to pluck out keywords associated with a document within a record. (I also have full versions of all the contracts in HTML format which may be easier).
It may also be there is a more efficient way the code could be written because I think this went though all 35,747 files in the main database rather than looking at the approximately 1200 files I wanted to tag. This was because I noticed I was getting many “empty record” notices in the console when I know for a fact each entry in the database has a PDF file in it.
In any case, I am LOVING Tap Forms and looking forward to learning how to use these built in snippets along with customized scripts. Thank you Tap Forms community.
November 17, 2020 at 5:38 PM #42637
Daniel LeuParticipantBrent, great to hear that it worked, finally!
Is there a way you can select the 1200 file records with the advanced search? If that’s the case you can select the saved search and just run the script with a little change:
Before:
for (contract of contractsForm.getRecords()){
After:
for (contract of search.getRecords()){
It is strange to get the ’empty records’ notice although there should be a file associated with every record. That’s something I would look into.
Fetching the file record might be what takes a bit of time. If you run this script frequently, it might be worth a try to save the PDF filename in a field and then just use this field in the comparison.
Oh, don’t forget to backup your database, specially with so many entries!
I might add a little sanity check as well to verify that a record was really found:
var success = 0; // Loop over all contract records for (contract of contractsForm.getRecords()){ if (!contract.getFieldValue(pdf_id) || contract.getFieldValue(pdf_id).length == 0) { console.log('Record is missing PDF: ' + contract.getUrl()); continue; } // this assumes that there is only one PDF per contract! var pdfName = contract.getFieldValue(pdf_id)[0].filename; if (csvPdfName == pdfName){ console.log("Found match: " + pdfName); success = 1; // Update contract record addKeyword(contract, keyword_id, entry.getFieldValue(csvKeyword_id)); break; } } if (!success) { console.log("No match found: " + csvPdfName, "#FF0000"); }
November 18, 2020 at 1:09 AM #42640
Sam MoffattParticipantSome of the slow down is that you’re doing two loops and if you can control it a little more you might be able to create a unique key for one of the records and get rid of one of the loops. That would require making sure you imported your documents properly and maybe some sort of hash collision system. Obviously if you can reduce your candidate set to a search then that would reduce your loops down as well.
December 12, 2020 at 7:54 PM #42832
Brent SParticipantHi Sam and Daniel,
In order to get rid of one of the loops I could move everything to the script. That is, instead of having multiple columns with different data in my CSV file that need to be looked up and replaced,
-
I could reduce the CSV file to a single column that only has the PDF name
. That might solve a number of problems (not only speed, but the ’empty records’)
Then, all the script needs to do is a) find the matching record from the CSV form in the database that has the same PDF name, b) run the rest of the script to populate the fields in that record, and c) repeat for other records in the CSV file until finished.
I know that if I did it this way the end of the script should look something like this if there were two fields to populate.
function Update_Records() {
var csvForm = document.getFormNamed(“csv”);
var contractsForm = document.getFormNamed(“Contracts”);// specifies type of agreement
var type_of_agreement_id = ‘fld-6af499ed439143b297828643341d939b’
record.setFieldValue (type_of_agreement, ‘Licensing Agreement’)// This should “replace” keyword or keywords in a selected record
var agreement_keywords_id = ‘fld-a2126cbe512646e9ba144fb5a90b06dc’;
record.setFieldValue (agreement_keywords_id, ‘Vaccine, Animal, Bacterial’)Could you show me how you would take the original code we had above and simplify the beginning of the script and modify it so it works with the code I have here?
Thanks very much for any insight you can provide.
Cheers,
BrentDecember 12, 2020 at 9:44 PM #42833
Brent SParticipantThis is how I tried to fix, but still have the two loops:
function Update_Records() {
var csvForm = document.getFormNamed(“csv”);
var contractsForm = document.getFormNamed(“Contracts”);// Contracts
var keyword_id = ‘fld-a2126cbe512646e9ba144fb5a90b06dc’;
var pdf_id = ‘fld-3ea77ccc18604174b722ece105965c44’;// get use fixed field Id
var csvPdf_id = ‘fld-4bcd0f1fba5c4178bb8d10a112b17489’;// Loop over all csv entries
for (entry of csvForm.getRecords()){
// get pdf file name of CSV record
var csvPdfName = entry.getFieldValue(csvPdf_id);// replace spaces with underscores
csvPdfName = csvPdfName.replace(‘/ /g’, ‘_’);
console.log(“Processing: ” + csvPdfName);// Loop over all contract records
for (contract of contractsForm.getRecords()){
if (!contract.getFieldValue(pdf_id) || contract.getFieldValue(pdf_id).length == 0) {
console.log(‘Record is missing PDF: ‘ + contract.getUrl());
continue;
}// this assumes that there is only one PDF per contract!
var pdfName = contract.getFieldValue(pdf_id)[0].filename;if (csvPdfName == pdfName){
console.log(“Found match: ” + pdfName);// Update contract record
var type_of_agreement_id = ‘fld-6af499ed439143b297828643341d939b’
record.setFieldValue (type_of_agreement, ‘Licensing Agreement’)// This should “replace” keyword or keywords in a selected record
var agreement_keywords_id = ‘fld-a2126cbe512646e9ba144fb5a90b06dc’;
record.setFieldValue (agreement_keywords_id, ‘Vaccine, Animal, Bacterial’)break;
}}
document.saveAllChanges();}
return ‘done’;
}Update_Records();
December 13, 2020 at 8:54 PM #42841
Sam MoffattParticipantIf you were to aim to optimise the script, I’d probably want to have a map of PDF name to Tap Forms record ID. You can do a retrieve of a record by key/record ID via the script so if you precomputed a map of these keys or used the record ID import mode for Tap Forms to import the records then it’ll be a single retrieve for retrieving the record (
form_record_id
is the field in the CSV file which if present will be prepended with arec-
prefix and allow you to useform.getRecordWithId
). The horse has likely already bolted for this one but something to consider for future projects, so let’s go ahead with the other suggestion: PDF attachment name in a script field.If you have the PDF attachment name in a script field, you could use the
getRecordFromFormWithKey
function in my script manager. The method signature looks like this:getRecordFromFormWithKey(formName, keyFieldId, keyValue, createIfMissing = true, earlyTermination = true, alwaysScanOnMiss = false)
It creates in Javascript an index on a first pass and then does looks up from there (set
earlyTermination
tofalse
andalwaysScanOnMiss
totrue
). When you’re in a tight loop this should improve performance by doing the record loop once to build the index and then you can do a similar sort of roughly constant time retrieval for matching records. It’s still two loops but you remove the nesting and replace it with a more optimised data structure.First step is to create a new script field in your Contracts form and in it you want to add in the code we use to grab the filename from the PDF:
record.getFieldValue('fld-e870153bd0a647c9823ca998c213d1fd')[0].filename;
You’ll need to tick the “Update records when saving” option when creating that to populate the new field in all of your record. Essentially we’re just copying the filename as a new field. At this point you might also want to setup a search to look for records that have this field empty because that seems likely a bug. You’ll also need the field ID from the form panel (underneath the “Description” field) which I refer to later as “fld-scriptfieldidhere” that you’ll need to change.
Using the code you provided as a basis, something like this should get you some of the way there:
document.getFormNamed('Script Manager').runScriptNamed('getRecordFromFormWithKey'); function Update_Records() { var csvForm = document.getFormNamed("csv"); // Contracts var keyword_id = 'fld-a2126cbe512646e9ba144fb5a90b06dc'; var pdf_id = 'fld-3ea77ccc18604174b722ece105965c44'; // get use fixed field Id var csvPdf_id = 'fld-4bcd0f1fba5c4178bb8d10a112b17489'; var csvKeyword_id = cvsForm.getFieldNamed('Keyword').getId(); // Loop over all csv entries for (entry of csvForm.getRecords()) { // get pdf file name of CSV record var csvPdfName = entry.getFieldValue(csvPdf_id); // replace spaces with underscores csvPdfName = csvPdfName.replace('/ /g', '_'); console.log("Processing: " + csvPdfName); contract = getRecordFromFormWithKey("Contracts", "fld-scriptfieldidhere", csvPdfName, false, false, true); if (!contract) { console.log('Unable to find matching PDF record: ' + csvPdfName); continue; } console.log("Found match: " + csvPdfName); // Update contract record var type_of_agreement_id = 'fld-6af499ed439143b297828643341d939b' contract.setFieldValue(type_of_agreement, 'Licensing Agreement') // This should “replace” keyword or keywords in a selected record var agreement_keywords_id = 'fld-a2126cbe512646e9ba144fb5a90b06dc'; contract.setFieldValue(agreement_keywords_id, entry.getFieldValue(csvKeyword_id)); } document.saveAllChanges(); return 'done'; } Update_Records();
We’ve still got the outer loop that iterates over the incoming CSV form, the “Contracts” loop is removed and replaced with the call to
getRecordsFromFormWithKey
which internally does a loop for you but also builds an index in memory that should make subsequent accesses quicker. Watch your memory usage in Activity Monitor but you’ve got 64GB of RAM so we should be fine. The script is finding the record for you so the code that lived inside your inner for loop moves up a level though there is a check to see if we got a record back. I did some minor changes to set the keyword based on the CSV form as an example of how I think that to work.One thing I’ve done in scripts like this is put in a field for when it was processed to be able to skip it and also a counter to limit how many records are processed. This is useful for debugging scripts and being able to progressively execute it to get an idea of it:
let minTime = new Date().getTime() - 86400000; // one day let processed = 0; let skipped = 0; for (...) { let previousDate = entry.getFieldValue('fld-previousdateidhere'); if (previousDate && previousDate.getTime() > minTime) { skipped++; continue; } entry.setFieldValue('fld-previousdateidhere', new Date()); processed++; if (processed > 100) { break; } // your existing for loop logic here }
This uses an extra field and updates when the script is run to set it that way the next time you run the script, it skips past records it’s already “processed” so to speak and moves onto the next batch of 100. Instead of handling the date logic in code, you could also tie this into a saved search as well so that Tap Forms only gives you the candidate records.
December 14, 2020 at 1:07 PM #42847
Daniel LeuParticipantLoading the PDF attachments into memory should provide a nice performance boost. It would be interesting to see how the run times compares between the two versions.
December 18, 2020 at 9:34 AM #42898
Brent SParticipantThanks guys. Once I get it working I will report back on performance and then close this thread. Appreciate all the support.
-
AuthorPosts
You must be logged in to reply to this topic.