Build and Deploy a Regression Model of House Renting in India using Azure AutoML

Introduction

With the rising demand for affordable housing and the evolving rental market in India, accurate prediction of house rents has become crucial for landlords, tenants, and real estate professionals alike. Thanks to advancements in machine learning and cloud computing, building and deploying regression models has become more accessible than ever. In this article, we will explore how you can utilize Azure AutoML, a powerful automated machine-learning platform, to develop a regression model for house rent in India.

Understanding Azure AutoML

Azure AutoML, part of Microsoft's Azure Machine Learning service, simplifies the process of developing machine learning models by automating the time-consuming and complex tasks involved. It provides a user-friendly interface, making it accessible to both beginners and experienced data scientists. By leveraging AutoML, you can rapidly experiment with various algorithms, feature engineering techniques, and hyperparameter settings, which helps you reach a well-performing regression model with far less manual effort.

Building a Regression Model with Azure AutoML


Set up Azure AutoML workspace

Go to the Azure Portal and search for Azure Machine Learning, then click on the Create dropdown and choose New workspace.

Choose the subscription, resource group, and region, and type the workspace name. Then, click on Review + Create.

Once the workspace is created, click on Launch Studio.

From Machine Learning Studio, click on the New button and choose Automated ML job.

Create data asset

We will upload a dataset from Kaggle; visit this page to download it and see further details. Click on Create.

This is the dataset glossary:

  • BHK: Number of Bedrooms, Hall, Kitchen.
  • Rent: Rent of the Houses/Apartments/Flats.
  • Size: Size of the Houses/Apartments/Flats in Square Feet.
  • Floor: The floor on which the House/Apartment/Flat is situated and the total number of floors (Example: Ground out of 2, 3 out of 5, etc.)
  • Area Type: Whether the size of the Houses/Apartments/Flats is calculated on Super Area, Carpet Area, or Built Area.
  • Area Locality: Locality of the Houses/Apartments/Flats.
  • City: The city where the Houses/Apartments/Flats are located.
  • Furnishing Status: Furnishing status of the Houses/Apartments/Flats: Furnished, Semi-Furnished, or Unfurnished.
  • Tenant Preferred: The type of tenant preferred by the owner or agent.
  • Bathroom: Number of Bathrooms.
  • Point of Contact: Whom you should contact for more information regarding the Houses/Apartments/Flats.

Type the data asset name and description next.

Azure integrates with various data storage options, including Azure Blob Storage and Azure Data Lake Storage, and also accepts direct uploads from your local machine. Click on From local files and Next.

Azure Blob Storage will be selected by default; choose the datastore whose name ends with blobstore. Click on Next.

Upload the CSV file you downloaded in the previous steps. Click on Next.

You will see a preview of the dataset uploaded. Click on Next.

You can disable any columns that won't be used for training. We won't use the Posted On column, so let's disable it and click on Next.
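
If you prefer to remove the column before uploading the file, the same result can be approximated locally. The following is a minimal sketch using only Python's standard library; the function name drop_column and the two sample rows are made up for illustration and are not taken from the Kaggle file.

```python
import csv
import io


def drop_column(csv_text: str, column: str) -> str:
    """Return the CSV text with the named column removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in reader.fieldnames if name != column]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the dropped column in each row.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Illustrative rows only (hypothetical values, reduced set of columns).
sample = (
    "Posted On,BHK,Rent,Size,City\n"
    "2022-05-18,2,10000,1100,Kolkata\n"
    "2022-05-13,2,20000,800,Kolkata\n"
)
cleaned = drop_column(sample, "Posted On")
print(cleaned)
```

Disabling the column in the studio has the advantage that the uploaded file stays untouched, so you can re-enable the column later without uploading again.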

You will see a summary of the settings for your data asset. If you are OK with them, click on Create.

After a few seconds, the data asset will appear. Select it and click on Next.

Job Configuration

Now that the data asset is created, we need to configure the experiment. Type the experiment name, select the target column (Rent), and leave Compute cluster as the compute type. Click on New to create a compute cluster.

For this example, we are going to use the following configuration:

  • Virtual machine tier: Dedicated
  • Virtual machine type: CPU
  • Virtual machine size: Standard_DS3_v2

Type the compute name, and set 0 and 1 as the minimum and maximum number of nodes. Click on Create.

Once the compute cluster is created, it will be selected by default. Click on Next.

Task and Settings

Select Regression and click on View additional configuration settings.

Just for this demo, we will change the Training job time to one hour. Click on Save, and from the Select task and settings step, click on Next.

We will leave Auto as the Validation type. For the Test data asset, select Test split and set it to 10%, so that 10% of the data is held out for testing. Click on Finish.
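
To make the idea of a 10% test split concrete, here is a minimal sketch of a random holdout split in plain Python. It only illustrates the concept; the function name holdout_split and the fixed seed are assumptions, and Azure AutoML may split the data differently internally.

```python
import random


def holdout_split(rows, test_fraction=0.10, seed=42):
    """Randomly hold out a fraction of the rows for testing."""
    indices = list(range(len(rows)))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    test_size = round(len(rows) * test_fraction)
    test_idx = set(indices[:test_size])
    train = [row for i, row in enumerate(rows) if i not in test_idx]
    test = [row for i, row in enumerate(rows) if i in test_idx]
    return train, test


# Stand-in data: 100 row identifiers instead of real dataset rows.
rows = list(range(100))
train_rows, test_rows = holdout_split(rows)
print(len(train_rows), len(test_rows))  # 90 10
```

The held-out rows are never seen during training, which is why the metrics computed on them give a fairer picture of how the model behaves on new listings.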

Automated Model Training

A job is created, and we must wait around one hour, in line with the training job time we set previously.

Once the training is completed, you will see the job duration (1h 5m 23.38s) and the best model (VotingEnsemble) summary. Click on the Models tab.

Model Selection and Evaluation

In this tab, you will see all the models (more than 40) created for the job. Every model is scored with the same metric (Normalized root mean squared error), so they can be compared directly. Each model also lists the hyperparameters used to obtain that score. Click on View explanation.

From the Aggregate feature importance tab, you will see the top features ranked by importance. Among the top 6 features in this run, the City and Size columns are at almost the same level of importance. Click on the Model Performance tab.

From this tab, you can evaluate the performance of your model by exploring the distribution of your prediction values and the values of your model performance metrics. Click on the Metrics tab.

From this tab, you will see all the metrics for your model. We will focus on two of them:

  • R2 score: 0.528. Since this value falls between 0.5 and 0.75, the model explains a moderate share (about 53%) of the variance in rent.
  • Root Mean Squared Error (RMSE): 5.3557e+4, which in decimal notation is 53557. RMSE can be interpreted as the standard deviation of the residuals (the unexplained variation), and it has the useful property of being in the same units as the response variable. Lower RMSE values indicate a better fit. This value is high, most likely because rents in the dataset range from 1200 to 3500000, so a few extreme values dominate the squared errors.
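
The two metrics above can be reproduced by hand. The sketch below uses their textbook definitions in plain Python; the function name regression_metrics and the sample values are illustrative, and the normalization by the range of the true values is an assumption about how the normalized metric is defined.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R2, RMSE, and range-normalized RMSE from their definitions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual sum of squares: the variation the model failed to explain.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: the variation around the mean of the true values.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": rmse,
        # Assumption: normalize by the range of the true values.
        "normalized_rmse": rmse / (max(y_true) - min(y_true)),
    }


# Illustrative values only, not taken from the rent dataset.
y_true = [10000, 20000, 30000, 40000]
y_pred = [12000, 18000, 33000, 37000]
metrics = regression_metrics(y_true, y_pred)
print(metrics)

# The article's RMSE divided by the rent range quoted in the text.
rmse_over_range = 53557 / (3500000 - 1200)
print(round(rmse_over_range, 4))
```

The last line shows why a range-normalized error can look small even when the RMSE is large: a single rent of 3500000 stretches the range far beyond what most listings cost.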

Also, you can see the following charts:

  • Predicted vs True: It plots the relationship between the target feature (true/actual values) and the model's predictions. The true values are binned along the x-axis, and for each bin, the mean predicted value is plotted with error bars. This allows you to see if a model is biased toward predicting certain values. The line displays the average prediction, and the shaded area indicates the variance of predictions around that mean.

  • Residual Histogram: It is a histogram of the prediction errors (residuals) generated for regression and forecasting experiments. Residuals are calculated as y_predicted - y_true for all samples and then displayed as a histogram to show model bias.
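
As a rough illustration of the residual histogram, the following sketch computes residuals as y_predicted - y_true and counts them in fixed-width bins. The function name residual_histogram, the bin width, and the sample values are assumptions chosen for the example; the studio picks its own bins.

```python
from collections import Counter


def residual_histogram(y_true, y_pred, bin_width):
    """Return the residuals and a mapping of bin start -> count."""
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    # Floor division places each residual in the bin [k * w, (k + 1) * w).
    counts = Counter((r // bin_width) * bin_width for r in residuals)
    return residuals, dict(sorted(counts.items()))


# Illustrative values only, not taken from the rent dataset.
y_true = [10000, 20000, 30000, 40000]
y_pred = [12000, 18000, 33000, 37000]
residuals, histogram = residual_histogram(y_true, y_pred, bin_width=2500)

for bin_start, count in histogram.items():
    print(f"{bin_start:>7} to {bin_start + 2500:>7}: {'#' * count}")
print("mean residual:", sum(residuals) / len(residuals))
```

A histogram centered on zero suggests the model is not systematically over- or under-predicting; a mean residual far from zero points to bias.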

Visit this page to learn more about the evaluation results.

Deployment

Go to the Overview tab, click on the Deploy dropdown, and then choose Real-time endpoint.

We will use the quick deployment settings; if you want to use the advanced settings, click on More options. Type the endpoint name and set the instance count to 1. Click on Deploy.

From the left sidebar, go to Endpoints and click the one you created.

Wait a few minutes for the endpoint to be deployed. Click on the Test tab.

You can test more than one case at once.

Go to the Consume tab to get what you need to test the endpoint from Postman. Copy the deployment name, the REST endpoint, and the primary key.

Open Postman or any similar tool for testing REST APIs. Under Authorization, paste the primary key as the Bearer token. Under Headers, add azureml-model-deployment and set its value to the deployment name you copied.

This is the full body for the above request.

{
  "Inputs": {
    "data": [
      {
        "BHK": 2,
        "Size": 675,
        "Floor": "1 out of 2",
        "Area Type": "Carpet Area",
        "Area Locality": "Santoshpur",
        "City": "Kolkata",
        "Furnishing Status": "Unfurnished",
        "Tenant Preferred": "Family",
        "Bathroom": 1,
        "Point of Contact": "Contact Owner"
      },
      {
        "BHK": 1,
        "Size": 425,
        "Floor": "1 out of 2",
        "Area Type": "Super Area",
        "Area Locality": "Sreebhumi",
        "City": "Kolkata",
        "Furnishing Status": "Semi-Furnished",
        "Tenant Preferred": "Bachelors",
        "Bathroom": 1,
        "Point of Contact": "Contact Agent"
      },
      {
        "BHK": 2,
        "Size": 1025,
        "Floor": "1 out of 2",
        "Area Type": "Built Area",
        "Area Locality": "Chromepet, GST Road",
        "City": "Chennai",
        "Furnishing Status": "Furnished",
        "Tenant Preferred": "Bachelors/Family",
        "Bathroom": 2,
        "Point of Contact": "Contact Owner"
      }
    ]
  },
  "GlobalParameters": 1.0
}
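
The same request can also be sent from Python instead of Postman. The sketch below builds it with the standard library only; the function name build_scoring_request, the URL, the key, and the deployment name are placeholders to be replaced with the values from the Consume tab, and the actual network call is left commented out.

```python
import json
import urllib.request


def build_scoring_request(endpoint_url, primary_key, deployment_name, records):
    """Build a POST request with the headers described for Postman."""
    body = {"Inputs": {"data": records}, "GlobalParameters": 1.0}
    headers = {
        "Content-Type": "application/json",
        # The primary key is sent as a Bearer token.
        "Authorization": "Bearer " + primary_key,
        # Routes the call to a specific deployment behind the endpoint.
        "azureml-model-deployment": deployment_name,
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# First record from the request body shown in the article.
record = {
    "BHK": 2,
    "Size": 675,
    "Floor": "1 out of 2",
    "Area Type": "Carpet Area",
    "Area Locality": "Santoshpur",
    "City": "Kolkata",
    "Furnishing Status": "Unfurnished",
    "Tenant Preferred": "Family",
    "Bathroom": 1,
    "Point of Contact": "Contact Owner",
}

# Placeholder values (assumptions): copy the real ones from the Consume tab.
request = build_scoring_request(
    "https://example.invalid/score", "<primary-key>", "<deployment-name>", [record]
)

# Uncomment to call the endpoint once the placeholders are replaced:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Keeping the request construction in its own function makes it easy to check the headers and body before sending anything to the live endpoint.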

For this request, we used existing records from the dataset but added 25 square feet to the Size column. The following table shows the comparison.

Record  Original Size  Request Size  Original Rent  Response Rent
1       650            675           7900           7262.59
2       400            425           8000           6900.04
3       1000           1025          15000          19090.62

Increasing only the size value should increase the rent as well, but for the first two records the predicted rent came out below the original rent. To improve this, you can try removing the records with a rent greater than or equal to 1000000 from the dataset and training again. There are 3 records that match this criterion.
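
A minimal sketch of that clean-up step, using plain Python, could look like the following. The function name split_by_rent is hypothetical, the first three rows reuse the sizes and rents from the table above, and the last row is a made-up outlier added only to show the filter at work.

```python
def split_by_rent(rows, threshold=1_000_000):
    """Separate rows below the rent threshold from rows at or above it."""
    kept = [row for row in rows if float(row["Rent"]) < threshold]
    removed = [row for row in rows if float(row["Rent"]) >= threshold]
    return kept, removed


rows = [
    {"Size": "650", "Rent": "7900"},
    {"Size": "400", "Rent": "8000"},
    {"Size": "1000", "Rent": "15000"},
    # Hypothetical outlier row: the size is invented for illustration.
    {"Size": "2500", "Rent": "3500000"},
]

kept, removed = split_by_rent(rows)
print(len(kept), "rows kept,", len(removed), "rows removed")
```

After filtering, the cleaned data would need to be uploaded as a new data asset and the Automated ML job run again to see whether the predictions improve.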

Visit this page to get further details on all the steps we covered in this article.

Conclusion

Thanks to Azure AutoML, building and deploying a regression model for house renting in India has become more accessible, even for those without extensive machine learning expertise. By leveraging the power of automated machine learning, you can develop accurate models that help both landlords and tenants make informed decisions based on predicted rental prices. Azure AutoML simplifies the process, enabling you to focus on extracting valuable insights from the data rather than getting lost in the complexities of model development. So, why not harness the potential of Azure AutoML and unlock the predictive power of machine learning in the Indian rental market?

Thanks for reading

Thank you very much for reading. I hope you found this article interesting and that it will be useful to you in the future. If you have any questions or ideas you would like to discuss, it will be a pleasure to collaborate and exchange knowledge.