Introduction
Building a custom neural model that extracts specific fields from resumes allows for efficient processing and standardization, and ensures compatibility with an ATS (Applicant Tracking System). Using Azure Document Intelligence Studio, we can develop a robust solution integrated with a .NET API that parses resumes and extracts key fields, including Personal Details, Summary, Education, Work Experience, Projects, Skills, and Certifications. This information is then displayed in an Angular application with a static resume format.
1. Creating the Model
Gather Resumes
First, collect a bunch of resumes to train our model. Make sure they cover most of the fields we need, like Personal Details, Summary, Education, Work Experience, Projects, Skills, and Certifications.
You can find a lot of ATS resume templates on the following websites.
You can only upload files in PDF and image formats.
Train the Neural Model
Go to the Azure Portal and create a Document Intelligence resource.
Then, go to Azure Document Intelligence Studio and click on Custom Extraction Model.
Click on Create a project.
Write the project name and description. Click on Continue.
Choose the Document Intelligence resource we created previously from the Azure Portal. Click on Continue.
Choose an existing Storage account or create a new one. Then select the blob container and click on Continue.
Review the details and click on Create Project.
Now upload the resumes you downloaded so you can label them. Make sure you have a variety of templates with most of the sections, such as Education, Work Experience, Skills, etc.
Once you upload the file, click on Run Layout to proceed with the OCR process.
For this example, we are going to add fields and tables. Click on Add a field / Field.
To label a field, we just need to click on each word that belongs to the tag we added and then click on the tag name. Repeat the same for the rest of the single fields.
For a tag that covers a lot of text, such as the Summary field, we can use the Draw region option next to Auto label. With this feature, we draw a rectangle around the section and then click on the tag name. Finally, click on End drawing.
For the fields that belong to a table, such as Education, Work Experience, Skills, Certifications, and Projects, we must use the Table field.
Write the table name (WorkExperience in this example) and select Dynamic as the type. Click on Create.
You must rename the fields that are created by default and add the rest of the fields.
Every field, whether it belongs to a table or stands alone, has the type string by default. For dates, change the type to date and leave the format unspecified, because some resumes give only the year and others only the year and month.
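Because a labeled date can be a full date, a year and month, or only a year, whatever consumes the extracted value has to tolerate all three. Here is a minimal TypeScript sketch of that normalization, assuming the value arrives as an ISO-like string (2021, 2021-03, or 2021-03-15). The PartialDate type and the parsePartialDate function are hypothetical names for illustration, not part of Document Intelligence or the demo project.

```typescript
// Hypothetical shape: records which parts of the date were actually present.
interface PartialDate {
  year: number;
  month?: number; // 1-12, absent when the resume only gives a year
  day?: number;   // 1-31, absent when the resume gives only year and month
}

// Parses "YYYY", "YYYY-MM", or "YYYY-MM-DD"; returns null for anything else.
function parsePartialDate(raw: string): PartialDate | null {
  const match = /^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$/.exec(raw.trim());
  if (match === null) return null;

  const result: PartialDate = { year: Number(match[1]) };

  if (match[2] !== undefined) {
    const month = Number(match[2]);
    if (month < 1 || month > 12) return null;
    result.month = month;
  }

  if (match[3] !== undefined) {
    const day = Number(match[3]);
    if (day < 1 || day > 31) return null;
    result.day = day;
  }

  return result;
}
```

Keeping the missing parts undefined, instead of defaulting them to January or the first of the month, lets the display layer show exactly what the resume stated.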
Finalize the labeling process for the Work Experience and click on the X icon.
Repeat the same for the rest of the sections with tables. For the Skills, make sure each skill is in a different row in the table you create for labeling.
Make sure every field you added is labeled in each resume you provided. If a resume lacks some information, such as certifications or skills, that is fine, and the field will simply be empty. However, most resumes should have at least the common sections:
- Personal Details
- Education
- Work Experience
The more resumes you label, and the more varied they are, the better, as long as they are ATS-compatible. Then, at the top right, click on Train. Write the Model ID. The build mode will be Neural because we will have different templates. Click on Train.
This process can take between 20 minutes and 1 hour, depending on the number of files and fields to be trained. You can create as many models as you need, and in each version you can add more fields or improve the ones that got lower confidence scores during testing.
Test and Adjust
From the left sidebar, click on Test and run the model with a few test resumes. Tweak it until it consistently extracts the right information. To the right, you will see the extracted fields with their values and confidence scores.
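The confidence scores can also guide the next labeling round. As a rough sketch, assuming each extracted field is reduced to a name, its text, and a confidence between 0 and 1, the fields that need more labeled examples could be listed as follows. The ExtractedField type, the lowConfidenceFields function, and the 0.8 threshold are illustrative choices, not values taken from the Studio.

```typescript
// Hypothetical, simplified view of one extracted field.
interface ExtractedField {
  name: string;
  content: string;
  confidence: number; // between 0 and 1
}

// Returns the fields below the threshold, weakest first.
function lowConfidenceFields(
  fields: ExtractedField[],
  threshold: number = 0.8
): ExtractedField[] {
  return fields
    .filter((field) => field.confidence < threshold)
    .sort((a, b) => a.confidence - b.confidence);
}

// Example with made-up values.
const sampleFields: ExtractedField[] = [
  { name: "FullName", content: "Jane Doe", confidence: 0.98 },
  { name: "Email", content: "jane@example.com", confidence: 0.75 },
  { name: "Summary", content: "Software engineer", confidence: 0.61 },
];

console.log(lowConfidenceFields(sampleFields).map((field) => field.name));
```

Fields that keep appearing in this list across several test resumes are good candidates for additional labeled samples in the next model version.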
Once you're comfortable with the results of your model, it's time to implement it in an app. In this case, we call the model from a .NET API and then consume that API from an Angular app. Go to the Azure Portal and copy the key and endpoint for the Document Intelligence resource.
2. Building the .NET API
- Set Up the API: Create a .NET API to interact with our custom model. Use Azure's SDK (Azure.AI.FormRecognizer) to connect the API to the model.
- Extract Fields: When a resume is uploaded, the API sends it to the model and gets back the extracted fields in a neat JSON format.
- Handle Errors: Build in some error handling for cases where the resume format is tricky or some fields are missing.
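The API in the demo project is a .NET project, but the mapping and error-handling idea in the bullets above is language-independent. Here is a minimal TypeScript sketch of it, assuming each row of the Work Experience table arrives as a dictionary from column name to extracted text. The TableRow type, the column names (Position, Company, and so on), and mapWorkExperience are assumptions for illustration; use the names you gave your own labels.

```typescript
// One labeled table row: column name -> extracted text (absent when not found).
type TableRow = Record<string, string | undefined>;

interface WorkExperienceItem {
  position: string | null;
  company: string | null;
  location: string | null;
  startDate: string | null;
  endDate: string | null;
  description: string | null;
}

// Returns the trimmed value, or null when the column is missing or blank.
function cell(row: TableRow, column: string): string | null {
  const value = row[column]?.trim();
  return value ? value : null;
}

// Maps table rows to response items without throwing on incomplete data.
function mapWorkExperience(rows: TableRow[] | undefined): WorkExperienceItem[] {
  // A resume without this section yields an empty list rather than an error.
  return (rows ?? [])
    .map((row) => ({
      position: cell(row, "Position"),
      company: cell(row, "Company"),
      location: cell(row, "Location"),
      startDate: cell(row, "StartDate"),
      endDate: cell(row, "EndDate"),
      description: cell(row, "Description"),
    }))
    // Keep only rows that identify at least a position or a company.
    .filter((item) => item.position !== null || item.company !== null);
}
```

Returning null for a missing column, instead of failing the whole request, means a resume with a tricky layout still produces a partial but usable response.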
You can reuse the project I created for this demo.
In the appsettings.Development.json, add your API key, endpoint, and model ID.
"AzureDocumentIntelligence": {
  "endpoint": "",
  "apiKey": "",
  "modelId": ""
}
That's it. Run the API in IIS Express and test the api/resume endpoint. This is the schema for the response:
{
  "personalDetails": {
    "fullName": "string",
    "email": "string",
    "cellphone": "string",
    "location": "string",
    "summary": "string"
  },
  "socialNetworks": {
    "linkedIn": "string",
    "gitHub": "string"
  },
  "education": [
    {
      "school": "string",
      "degree": "string",
      "location": "string",
      "startDate": "2024-10-13",
      "endDate": "2024-10-13"
    }
  ],
  "workExperience": [
    {
      "position": "string",
      "company": "string",
      "location": "string",
      "startDate": "2024-10-13",
      "endDate": "2024-10-13",
      "description": "string"
    }
  ],
  "skills": [
    "string"
  ],
  "projects": [
    {
      "name": "string",
      "startDate": "2024-10-13",
      "endDate": "2024-10-13",
      "description": "string",
      "url": "string"
    }
  ],
  "certifications": [
    {
      "name": "string",
      "issueDate": "2024-10-13",
      "issuingOrganization": "string"
    }
  ],
  "languages": [
    {
      "name": "string",
      "level": "string"
    }
  ]
}
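On the Angular side, the schema above can be mirrored as TypeScript types so the compiler catches misspelled property names. This is only a sketch: the interface names and the withDefaults helper are hypothetical, dates are kept as strings as in the schema but typed as nullable on the assumption that a missing date comes back as null, and every section is optional on the assumption that the API may omit sections a resume does not contain.

```typescript
interface PersonalDetails {
  fullName: string;
  email: string;
  cellphone: string;
  location: string;
  summary: string;
}

interface SocialNetworks {
  linkedIn: string;
  gitHub: string;
}

interface Education {
  school: string;
  degree: string;
  location: string;
  startDate: string | null;
  endDate: string | null;
}

interface WorkExperience {
  position: string;
  company: string;
  location: string;
  startDate: string | null;
  endDate: string | null;
  description: string;
}

interface Project {
  name: string;
  startDate: string | null;
  endDate: string | null;
  description: string;
  url: string;
}

interface Certification {
  name: string;
  issueDate: string | null;
  issuingOrganization: string;
}

interface Language {
  name: string;
  level: string;
}

interface ResumeResponse {
  personalDetails?: Partial<PersonalDetails>;
  socialNetworks?: Partial<SocialNetworks>;
  education?: Education[];
  workExperience?: WorkExperience[];
  skills?: string[];
  projects?: Project[];
  certifications?: Certification[];
  languages?: Language[];
}

// Fills in empty defaults so a template can iterate without null checks.
function withDefaults(resume: ResumeResponse): Required<ResumeResponse> {
  return {
    personalDetails: resume.personalDetails ?? {},
    socialNetworks: resume.socialNetworks ?? {},
    education: resume.education ?? [],
    workExperience: resume.workExperience ?? [],
    skills: resume.skills ?? [],
    projects: resume.projects ?? [],
    certifications: resume.certifications ?? [],
    languages: resume.languages ?? [],
  };
}
```

Calling withDefaults once, right after the response arrives, keeps the null handling in one place instead of spreading it across the template.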
3. Displaying Data in Angular
- Angular Basics: Start with a basic Angular setup. Make sure it’s ready to communicate with the .NET API.
- Create a Template: Design a static template in Angular to show the extracted info. This template will have sections for Personal Details, Summary, Education, Work Experience, Projects, Skills, and Certifications.
- Bind the Data: Use Angular’s data-binding features to dynamically display the extracted resume fields in the template.
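Since the template is static but resumes differ, the binding step usually needs to know which sections actually contain data. A small sketch of that decision, written as plain TypeScript so it can be tested outside Angular: the ResumeForDisplay type and the visibleSections function are hypothetical names, and the section order follows the list in the bullets above. In a component, the result could drive an *ngFor or *ngIf.

```typescript
// Hypothetical subset of the API response, just enough to decide visibility.
interface ResumeForDisplay {
  personalDetails?: { fullName?: string; summary?: string };
  education?: unknown[];
  workExperience?: unknown[];
  projects?: unknown[];
  skills?: string[];
  certifications?: unknown[];
}

// True when the string exists and is not just whitespace.
function hasText(value: string | undefined): boolean {
  return value !== undefined && value.trim() !== "";
}

// True when the list exists and has at least one entry.
function hasItems(list: unknown[] | undefined): boolean {
  return list !== undefined && list.length > 0;
}

// Returns section titles in the fixed order of the static template,
// leaving out sections for which nothing was extracted.
function visibleSections(resume: ResumeForDisplay): string[] {
  const details = resume.personalDetails;
  const sections = [
    { title: "Personal Details", show: hasText(details?.fullName) },
    { title: "Summary", show: hasText(details?.summary) },
    { title: "Education", show: hasItems(resume.education) },
    { title: "Work Experience", show: hasItems(resume.workExperience) },
    { title: "Projects", show: hasItems(resume.projects) },
    { title: "Skills", show: hasItems(resume.skills) },
    { title: "Certifications", show: hasItems(resume.certifications) },
  ];
  return sections.filter((section) => section.show).map((section) => section.title);
}
```

Because the order is fixed in one array, every uploaded resume renders with the same layout, and empty sections disappear instead of showing blank headings.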
If you're using the same project from my GitHub repository, you just need to install the dependencies and then run the Angular app by executing ng serve. If you want to change the port for the .NET API, replace it in the environment.ts file.
export const environment = {
  production: false,
  baseUrl: 'https://localhost:44325',
};
From the Angular app, you will always see the same format with the extracted fields for the resume uploaded to the .NET API.
Conclusion
By integrating Azure AI Document Intelligence Studio with a .NET API, we can create a powerful tool for extracting resume details. This tool not only ensures compatibility with ATS but also makes the resume review process much smoother. Displaying the extracted information in an Angular app provides an easy-to-read format for recruiters.
Automating the resume sorting process saves time and resources, letting recruiters focus on finding the best candidates. It’s all about making life a little easier, one resume at a time.