Resume Processing with Azure Document Intelligence Studio, .NET and Angular

Introduction

Building a custom neural model to extract specific fields from resumes enables efficient, standardized processing and helps ensure compatibility with Applicant Tracking Systems (ATS). Using Azure Document Intelligence Studio, we can develop a robust solution integrated with a .NET API that parses resumes and extracts key fields, including Personal Details, Summary, Education, Work Experience, Projects, Skills, and Certifications. This information is then displayed in an Angular application using a static resume format.

1. Creating the Model


Gather Resumes

First, collect a set of resumes to train the model. Make sure that, between them, they cover most of the fields we need, such as Personal Details, Summary, Education, Work Experience, Projects, Skills, and Certifications.

You can find a lot of ATS resume templates on the following websites.

You can only upload files in PDF and image formats.

Train the Neural Model

Go to the Azure Portal and create a Document Intelligence resource.

Then, go to Azure Document Intelligence Studio and click on Custom Extraction Model.

Click on Create a project.

Write the project name and description. Click on Continue.

Choose the Document Intelligence resource we created previously from the Azure Portal. Click on Continue.

Choose an existing Storage account or create a new one. Then select the blob container and click on Continue.

Review the details and click on Create Project.

Now upload the resumes you downloaded so you can label them. Make sure you have a variety of templates that include most of the sections, such as Education, Work Experience, and Skills.

Once you upload the file, click on Run Layout to proceed with the OCR process.

For this example, we are going to add both fields and tables. To add a single field, click on Add a field, then Field.

To label a field, we just need to click on each word that belongs to the tag we added and then click on the tag name. Repeat the same for the rest of the single fields.

When a tag covers a lot of text, such as the Summary field, we can use the Draw region option next to Auto label. With this feature, we mark the section with a rectangle and then click on the tag name. Finally, click on End drawing.

For the fields that belong to a table, such as Education, Work Experience, Skills, Certifications, and Projects, we must use the Table field.

Enter the table name, which for this example is WorkExperience, and set the type to Dynamic. Click on Create.

Rename the fields that are created by default and add the rest of the fields.

Every field, whether it belongs to a table or stands alone, has the type string by default. For dates, change the type to date and leave the format unspecified, because some resumes give only the year and others only the year and month.
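
Because dates are labeled without a fixed format, whatever consumes the extracted values has to cope with partial dates. The following is a minimal sketch of that idea, assuming the date reaches the client as text in the form "YYYY", "YYYY-MM", or "YYYY-MM-DD". The names PartialDate and parsePartialDate are hypothetical and are not part of any Azure SDK.

```typescript
// Hypothetical helper types and functions: not part of any Azure SDK.
interface PartialDate {
  year: number;
  month?: number; // 1-12, absent when the resume gives only a year
  day?: number;   // absent when the resume gives only year and month
}

// Accepts "YYYY", "YYYY-MM" or "YYYY-MM-DD"; returns null for anything else.
function parsePartialDate(text: string): PartialDate | null {
  const match = /^(\d{4})(?:-(\d{1,2}))?(?:-(\d{1,2}))?$/.exec(text.trim());
  if (!match) return null;

  const result: PartialDate = { year: Number(match[1]) };

  if (match[2] !== undefined) {
    const month = Number(match[2]);
    if (month < 1 || month > 12) return null;
    result.month = month;
  }
  if (match[3] !== undefined) {
    const day = Number(match[3]);
    if (day < 1 || day > 31) return null;
    result.day = day;
  }
  return result;
}
```

Keeping the missing parts absent, rather than defaulting them to January or the first of the month, lets the display layer show exactly what the resume stated.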

Finalize the labeling process for the Work Experience and click on the X icon.

Repeat the same for the rest of the sections with tables. For the Skills, make sure each skill is in a different row in the table you create for labeling.

Make sure every field you added is labeled in each resume you upload. If a resume lacks some information, such as certifications or skills, that is fine; the field will simply be empty. However, most resumes should have at least the common sections:

  • Personal Details
  • Education
  • Work Experience

The more resumes you label, and the more varied their templates, the better, as long as they are ATS-compatible. Then, at the top right, click on Train. Enter the Model ID. Set the build mode to Neural because we have different templates. Click on Train.

Training can take between 20 minutes and 1 hour, depending on the number of files and fields. You can create as many models as you need, and in each version you can add more fields or improve the ones that had lower confidence scores in testing.

Test and Adjust

From the left sidebar, click on Test and run the model with a few test resumes. Tweak it until it consistently extracts the right information. To the right, you will see the extracted fields with their values and confidence scores.
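
To decide which fields need more labeling, it can help to list the ones whose confidence falls below a cutoff. Here is a small sketch of that check. The ExtractedField shape, the fieldsNeedingReview function, the sample field names, and the 0.8 threshold are all assumptions chosen for illustration, not values taken from the Studio.

```typescript
// Hypothetical shape for one extracted field as shown in the Test panel.
interface ExtractedField {
  name: string;
  value: string;
  confidence: number; // between 0 and 1
}

// Returns the names of fields scoring below the threshold, lowest first.
// filter() creates a new array, so the caller's array is not reordered.
function fieldsNeedingReview(fields: ExtractedField[], threshold = 0.8): string[] {
  return fields
    .filter((f) => f.confidence < threshold)
    .sort((a, b) => a.confidence - b.confidence)
    .map((f) => f.name);
}

// Made-up sample values for illustration only.
const sample: ExtractedField[] = [
  { name: "FullName", value: "Jane Doe", confidence: 0.97 },
  { name: "Summary", value: "Backend developer", confidence: 0.62 },
  { name: "Email", value: "jane@example.com", confidence: 0.74 },
];

console.log(fieldsNeedingReview(sample)); // [ 'Summary', 'Email' ]
```

Fields that keep appearing in this list across several test resumes are good candidates for extra labeled examples in the next model version.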

Once you're comfortable with the results of your model, it's time to use it in an app. In this case, we call it from a .NET API and then consume that API from an Angular app. Go to the Azure Portal and copy the key and endpoint for the Document Intelligence resource.

2. Building the .NET API

  • Set Up the API: Create a .NET API to interact with our custom model. Use Azure's SDK (Azure.AI.FormRecognizer) to connect the API to the model.
  • Extract Fields: When a resume is uploaded, the API sends it to the model and gets back the extracted fields in a neat JSON format.
  • Handle Errors: Build in some error handling for cases where the resume format is tricky or some fields are missing.
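
The extraction and error-handling steps above come down to reading labeled fields defensively, so that a missing field or table yields an empty value instead of an exception. The demo API is written in .NET; the sketch below illustrates only the mapping idea in TypeScript, against a minimal subset of the analyze result JSON as I understand it (fields carrying valueString, valueArray, valueObject, and content). The field names "FullName", "WorkExperience", "Company", and "Position" are assumptions standing in for whatever labels your model was trained with, and readText, readRows, and mapResume are hypothetical helpers.

```typescript
// Minimal subset of one field in the analyze result JSON (assumed shape).
interface DocField {
  type?: string;
  content?: string;
  valueString?: string;
  valueArray?: DocField[];
  valueObject?: Record<string, DocField>;
}

type FieldMap = Record<string, DocField | undefined>;

// Reads a text field, falling back to the raw content, then to an empty string.
function readText(fields: FieldMap, name: string): string {
  const field = fields[name];
  return field?.valueString ?? field?.content ?? "";
}

// Reads a dynamic table as a list of rows; a missing table yields an empty list.
function readRows(fields: FieldMap, name: string): FieldMap[] {
  const rows = fields[name]?.valueArray ?? [];
  return rows.map((row) => row.valueObject ?? {});
}

// Field names here are assumptions; replace them with your model's labels.
function mapResume(fields: FieldMap) {
  return {
    fullName: readText(fields, "FullName"),
    workExperience: readRows(fields, "WorkExperience").map((row) => ({
      company: readText(row, "Company"),
      position: readText(row, "Position"),
    })),
  };
}
```

Calling mapResume({}) returns an empty name and an empty work experience list, which is the behavior you want for resumes that lack a section.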

You can reuse the project I created for this demo.

In appsettings.Development.json, add your API key, endpoint, and model ID.

"AzureDocumentIntelligence": {
  "endpoint": "",
  "apiKey": "",
  "modelId": ""
}

That's it. Run the API in IIS Express and test the api/resume endpoint. This is the schema of the response.

{
  "personalDetails": {
    "fullName": "string",
    "email": "string",
    "cellphone": "string",
    "location": "string",
    "summary": "string"
  },
  "socialNetworks": {
    "linkedIn": "string",
    "gitHub": "string"
  },
  "education": [
    {
      "school": "string",
      "degree": "string",
      "location": "string",
      "startDate": "2024-10-13",
      "endDate": "2024-10-13"
    }
  ],
  "workExperience": [
    {
      "position": "string",
      "company": "string",
      "location": "string",
      "startDate": "2024-10-13",
      "endDate": "2024-10-13",
      "description": "string"
    }
  ],
  "skills": [
    "string"
  ],
  "projects": [
    {
      "name": "string",
      "startDate": "2024-10-13",
      "endDate": "2024-10-13",
      "description": "string",
      "url": "string"
    }
  ],
  "certifications": [
    {
      "name": "string",
      "issueDate": "2024-10-13",
      "issuingOrganization": "string"
    }
  ],
  "languages": [
    {
      "name": "string",
      "level": "string"
    }
  ]
}
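
On the Angular side, this schema can be mirrored with TypeScript interfaces so the compiler catches misspelled property names. The sketch below follows the JSON above property by property. The interface names and the emptyResume helper are assumptions for illustration and may differ from the ones in the demo project.

```typescript
// Interfaces mirroring the response schema; dates arrive as "YYYY-MM-DD" strings.
interface PersonalDetails {
  fullName: string;
  email: string;
  cellphone: string;
  location: string;
  summary: string;
}

interface SocialNetworks { linkedIn: string; gitHub: string; }

interface Education {
  school: string;
  degree: string;
  location: string;
  startDate: string;
  endDate: string;
}

interface WorkExperience {
  position: string;
  company: string;
  location: string;
  startDate: string;
  endDate: string;
  description: string;
}

interface Project {
  name: string;
  startDate: string;
  endDate: string;
  description: string;
  url: string;
}

interface Certification { name: string; issueDate: string; issuingOrganization: string; }

interface Language { name: string; level: string; }

interface Resume {
  personalDetails: PersonalDetails;
  socialNetworks: SocialNetworks;
  education: Education[];
  workExperience: WorkExperience[];
  skills: string[];
  projects: Project[];
  certifications: Certification[];
  languages: Language[];
}

// Hypothetical helper: an empty resume, usable as initial state before upload.
function emptyResume(): Resume {
  return {
    personalDetails: { fullName: "", email: "", cellphone: "", location: "", summary: "" },
    socialNetworks: { linkedIn: "", gitHub: "" },
    education: [],
    workExperience: [],
    skills: [],
    projects: [],
    certifications: [],
    languages: [],
  };
}
```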

3. Displaying Data in Angular

  • Angular Basics: Start with a basic Angular setup. Make sure it’s ready to communicate with the .NET API.
  • Create a Template: Design a static template in Angular to show the extracted info. This template will have sections for Personal Details, Summary, Education, Work Experience, Projects, Skills, and Certifications.
  • Bind the Data: Use Angular’s data-binding features to dynamically display the extracted resume fields in the template.
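
One way to prepare the data for the static template is to compute, in plain TypeScript, which sections have content before binding them. The sketch below does that with a framework-free function. ResumeSlice and visibleSections are hypothetical names, and hiding empty sections is a design option rather than something the demo project necessarily does.

```typescript
// Minimal slice of the API response used here; the full schema has more fields.
interface ResumeSlice {
  personalDetails: { fullName: string; summary: string };
  education: unknown[];
  workExperience: unknown[];
  projects: unknown[];
  skills: string[];
  certifications: unknown[];
}

// Returns the section titles to render, in the fixed order of the static template.
// Sections with no extracted data are skipped so the page shows no empty headings.
function visibleSections(resume: ResumeSlice): string[] {
  const sections: Array<[string, boolean]> = [
    ["Personal Details", resume.personalDetails.fullName.trim() !== ""],
    ["Summary", resume.personalDetails.summary.trim() !== ""],
    ["Education", resume.education.length > 0],
    ["Work Experience", resume.workExperience.length > 0],
    ["Projects", resume.projects.length > 0],
    ["Skills", resume.skills.length > 0],
    ["Certifications", resume.certifications.length > 0],
  ];
  return sections.filter(([, hasData]) => hasData).map(([title]) => title);
}
```

In a component, the returned list could drive the template's conditional rendering, while the order of sections stays the same for every resume.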

If you're using the same project from my GitHub repository, you just need to install the dependencies and then run the Angular app with ng serve. If you change the port for the .NET API, update it in the environment.ts file.

export const environment = {
  production: false,
  baseUrl: 'https://localhost:44325',
};

From the Angular app, you will always see the same format with the extracted fields for the resume uploaded to the .NET API.

Conclusion

By integrating Azure AI Document Intelligence Studio with a .NET API, we can create a powerful tool for extracting resume details. This tool not only ensures compatibility with ATS but also makes the resume review process much smoother. Displaying the extracted information in an Angular app provides an easy-to-read format for recruiters.

Automating the resume sorting process saves time and resources, letting recruiters focus on finding the best candidates. It’s all about making life a little easier, one resume at a time.