Introduction
C# Corner provides RSS feeds (a maximum of 100 posts each) with data about each author's posts. We will use one of these feeds to fetch an author's post details, then scrape each post for its category, number of views, and number of likes, and save the results to a SQL Server database. In our application, a user can enter an author id, fetch the post data for that author, and save it to the database. Once the data is populated, we can query it with LINQ through Entity Framework and show it in an Angular application as a chart or in other formats.
I have published more than 70 articles on C# Corner so far and wanted to analyze how far my posts reach. I looked for ways to fetch data from the C# Corner site and found the RSS feeds. However, the feeds do not contain data such as the number of views or the number of likes for a post. To get that data, I decided to scrape each post individually. Though this is a time-consuming process, it gave me the data I wanted from the C# Corner site. I believe this will be useful to other authors as well, so I am sharing the source code of the app as an attachment to this article.
Create a Web API project in Visual Studio
We need a Web API project to populate the data into a SQL Server database and to serve that data to our Angular application.
First, let us create an “ArticleMatrices” table in the SQL Server database using the SQL script below.
- CREATE TABLE [dbo].[ArticleMatrices](
- [Id] [int] IDENTITY(1,1) NOT NULL,
- [AuthorId] [nvarchar](50) NULL,
- [Author] [nvarchar](50) NULL,
- [Link] [nvarchar](250) NULL,
- [Title] [nvarchar](250) NULL,
- [Type] [nvarchar](50) NULL,
- [Category] [nvarchar](50) NULL,
- [Views] [nvarchar](50) NULL,
- [ViewsCount] [decimal](18, 0) NULL,
- [Likes] [int] NULL,
- [PubDate] [date] NULL,
- CONSTRAINT [PK_ArticleMatrices] PRIMARY KEY CLUSTERED
- (
- [Id] ASC
- ))
Open Visual Studio and create a new web application with the “ASP.NET Web Application” template. Choose the “Web API” option for the project.
After you click the OK button, the project is created with its default dependencies.
Install the “HtmlAgilityPack” NuGet package, which we will use to scrape the data.
We must enable CORS in this Web API project so that the Angular 8 application can call its services. Hence, we also install the “Microsoft.AspNet.WebApi.Cors” library.
Please note that this library installs three other related libraries as well.
We now need to enable CORS in the “WebApiConfig” file. I have enabled CORS for all domains; in real applications, you should restrict it to specific domains.
WebApiConfig.cs
- using System.Web.Http;
- using System.Web.Http.Cors;
-
- namespace AnalyticsWebAPI
- {
- public static class WebApiConfig
- {
- public static void Register(HttpConfiguration config)
- {
-
- var cors = new EnableCorsAttribute("*", "*", "*");
- config.EnableCors(cors);
-
-
- config.MapHttpAttributeRoutes();
-
- config.Routes.MapHttpRoute(
- name: "DefaultApi",
- routeTemplate: "api/{controller}/{id}",
- defaults: new { id = RouteParameter.Optional }
- );
- }
- }
- }
We need some model classes to fetch data from SQL Server and to hold C# Corner author information. For simplicity, I will create a single class file, “Models”, and define all the classes inside it.
Models.cs
- using System;
-
- namespace AnalyticsWebAPI.Models
- {
- public class ArticleMatrix
- {
- public int Id { get; set; }
- public string AuthorId { get; set; }
- public string Author { get; set; }
- public string Link { get; set; }
- public string Title { get; set; }
- public string Type { get; set; }
- public string Category { get; set; }
- public string Views { get; set; }
- public decimal? ViewsCount { get; set; }
- public int Likes { get; set; }
- public DateTime PubDate { get; set; }
- }
- public class Feed
- {
- public string Link { get; set; }
- public string Title { get; set; }
- public string FeedType { get; set; }
- public string Author { get; set; }
- public string Content { get; set; }
- public DateTime PubDate { get; set; }
-
- public Feed()
- {
- Link = "";
- Title = "";
- FeedType = "";
- Author = "";
- Content = "";
- PubDate = DateTime.Today;
- }
- }
- public class Authors
- {
- public string AuthorId { get; set; }
- public string Author { get; set; }
- public int Count { get; set; }
- }
-
- public class Category
- {
- public string Name { get; set; }
- public int Count { get; set; }
- }
- }
I have created the “ArticleMatrix”, “Feed”, “Authors”, and “Category” model classes. “ArticleMatrix” maps to the database table, “Feed” holds one RSS item, and “Authors” and “Category” carry the grouped counts returned to the Angular application.
We can create a Web API controller using scaffolding.
Choose the “Web API 2 Controller with actions, using Entity Framework” option and choose a model class from the drop-down list to create a new controller.
We have already created the “ArticleMatrices” table in SQL Server, and the “ArticleMatrix” model class has the same set of properties.
We will choose this class to create the controller. Entity Framework automatically maps the model class to the SQL table.
Scaffolding automatically creates a “SqlDbContext” context file along with the controller class file. Also, note that a connection string with default values is added to Web.config. You can modify this connection string with your SQL Server and database details.
We can now add the methods inside the Web API controller.
CsharpCornerController.cs
- using AnalyticsWebAPI.Models;
- using HtmlAgilityPack;
- using System;
- using System.Collections.Generic;
- using System.Data;
- using System.Globalization;
- using System.IO;
- using System.Linq;
- using System.Net;
- using System.Text;
- using System.Web.Http;
- using System.Xml.Linq;
-
- namespace AnalyticsWebAPI.Controllers
- {
- [RoutePrefix("api/CsharpCorner")]
- public class CsharpCornerController : ApiController
- {
- private SqlDbContext db = new SqlDbContext();
- readonly CultureInfo culture = new CultureInfo("en-US");
-
- [HttpGet]
- [Route("CreatePosts/{authorId}")]
- public bool CreatePosts(string authorId)
- {
- try
- {
- int count = 0;
- XDocument doc = XDocument.Load("https://www.c-sharpcorner.com/members/" + authorId + "/rss");
- var entries = from item in doc.Root.Descendants().First(i => i.Name.LocalName == "channel").Elements().Where(i => i.Name.LocalName == "item")
- select new Feed
- {
- Content = item.Elements().First(i => i.Name.LocalName == "description").Value,
- Link = (item.Elements().First(i => i.Name.LocalName == "link").Value).StartsWith("/") ? "https://www.c-sharpcorner.com" + item.Elements().First(i => i.Name.LocalName == "link").Value : item.Elements().First(i => i.Name.LocalName == "link").Value,
- PubDate = Convert.ToDateTime(item.Elements().First(i => i.Name.LocalName == "pubDate").Value, culture),
- Title = item.Elements().First(i => i.Name.LocalName == "title").Value,
- FeedType = (item.Elements().First(i => i.Name.LocalName == "link").Value).ToLowerInvariant().Contains("blog") ? "Blog" : (item.Elements().First(i => i.Name.LocalName == "link").Value).ToLowerInvariant().Contains("news") ? "News" : "Article",
- Author = item.Elements().First(i => i.Name.LocalName == "author").Value
- };
-
- List<Feed> feeds = entries.OrderByDescending(o => o.PubDate).ToList();
- string urlAddress = string.Empty;
- List<ArticleMatrix> articleMatrices = new List<ArticleMatrix>();
-
- foreach (Feed feed in feeds)
- {
- count++;
- if (count > 100) break;
- urlAddress = feed.Link;
-
- HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);
- HttpWebResponse response = (HttpWebResponse)request.GetResponse();
- string strData = "";
-
- if (response.StatusCode == HttpStatusCode.OK)
- {
- Stream receiveStream = response.GetResponseStream();
- StreamReader readStream = null;
-
- if (response.CharacterSet == null)
- {
- readStream = new StreamReader(receiveStream);
- }
- else
- {
- readStream = new StreamReader(receiveStream, Encoding.GetEncoding(response.CharacterSet));
- }
-
- strData = readStream.ReadToEnd();
-
- response.Close();
- readStream.Close();
-
- HtmlDocument htmlDocument = new HtmlDocument();
- htmlDocument.LoadHtml(strData);
-
- ArticleMatrix articleMatrix = new ArticleMatrix
- {
- AuthorId = authorId,
- Author = feed.Author,
- Type = feed.FeedType,
- Link = feed.Link,
- Title = feed.Title,
- PubDate = feed.PubDate
- };
-
- // Guard against a missing category element instead of throwing a NullReferenceException.
- string category = htmlDocument.GetElementbyId("ImgCategory")?.GetAttributeValue("title", "") ?? "";
-
- articleMatrix.Category = category;
-
- var view = htmlDocument.DocumentNode.SelectSingleNode("//span[@id='ViewCounts']");
- if (view != null)
- {
- articleMatrix.Views = view.InnerText;
-
- if (articleMatrix.Views.Contains("m"))
- {
- articleMatrix.ViewsCount = decimal.Parse(articleMatrix.Views.Substring(0, articleMatrix.Views.Length - 1)) * 1000000;
- }
- else if (articleMatrix.Views.Contains("k"))
- {
- articleMatrix.ViewsCount = decimal.Parse(articleMatrix.Views.Substring(0, articleMatrix.Views.Length - 1)) * 1000;
- }
- else
- {
- decimal.TryParse(articleMatrix.Views, out decimal viewCount);
- articleMatrix.ViewsCount = viewCount;
- }
- }
- else
- {
- articleMatrix.ViewsCount = 0;
- }
- var like = htmlDocument.DocumentNode.SelectSingleNode("//span[@id='LabelLikeCount']");
- if (like != null)
- {
- int.TryParse(like.InnerText, out int likes);
- articleMatrix.Likes = likes;
- }
- articleMatrices.Add(articleMatrix);
- }
- }
-
- count = 0;
-
- db.ArticleMatrices.RemoveRange(db.ArticleMatrices.Where(x => x.AuthorId == authorId));
-
- foreach (ArticleMatrix articleMatrix in articleMatrices)
- {
- count++;
- db.ArticleMatrices.Add(articleMatrix);
- }
-
- db.SaveChanges();
- return true;
- }
- catch
- {
- return false;
- }
-
- }
-
- [HttpGet]
- [Route("GetAll/{authorId}")]
- public IQueryable<ArticleMatrix> GetAll(string authorId)
- {
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId).OrderByDescending(x => x.PubDate);
- }
-
- [HttpGet]
- [Route("GetAuthors")]
- public IQueryable<Authors> GetAuthors()
- {
- return from x in db.ArticleMatrices.GroupBy(x => x.AuthorId)
- select new Authors
- {
- AuthorId = x.FirstOrDefault().AuthorId,
- Author = x.FirstOrDefault().Author,
- Count = x.Count()
- };
- }
-
- [HttpGet]
- [Route("GetCategory/{authorId}")]
- public IQueryable<Category> GetCategory(string authorId)
- {
- return from x in db.ArticleMatrices.Where(x => x.AuthorId == authorId).GroupBy(x => x.Category)
- select new Category
- {
- Name = x.FirstOrDefault().Category,
- Count = x.Count()
- };
- }
-
- [HttpGet]
- [Route("GetPosts/{authorId}/{category}/{orderBy}")]
- public IQueryable<ArticleMatrix> GetPosts(string authorId, string category, string orderBy)
- {
- var newCategory = category.Replace("~~~", ".").Replace("```", "&").Replace("!!!", "#");
- if (newCategory == "all")
- {
- switch (orderBy)
- {
- case "likes":
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId).OrderByDescending(x => x.Likes);
- case "views":
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId).OrderByDescending(x => x.ViewsCount);
- case "category":
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId).OrderBy(x => x.Category);
- case "type":
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId).OrderBy(x => x.Type);
- default:
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId).OrderByDescending(x => x.PubDate);
- }
- }
- else
- {
- switch (orderBy)
- {
- case "likes":
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId && x.Category == newCategory).OrderByDescending(x => x.Likes);
- case "views":
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId && x.Category == newCategory).OrderByDescending(x => x.ViewsCount);
- case "category":
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId && x.Category == newCategory).OrderBy(x => x.Category);
- case "type":
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId && x.Category == newCategory).OrderBy(x => x.Type);
- default:
- return db.ArticleMatrices.Where(x => x.AuthorId == authorId && x.Category == newCategory).OrderByDescending(x => x.PubDate);
- }
- }
- }
-
- protected override void Dispose(bool disposing)
- {
- if (disposing)
- {
- db.Dispose();
- }
- base.Dispose(disposing);
- }
- }
- }
To keep the code simple, I have made all the actions HTTP GET methods. I have created the “CreatePosts”, “GetAll”, “GetAuthors”, “GetCategory”, and “GetPosts” methods inside the controller class.
The “CreatePosts” method is the most important one: it populates the data for an author in the database. I have used several HtmlAgilityPack methods and properties inside this method. All other methods are self-explanatory.
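The per-post rules inside “CreatePosts” can be hard to see in the long listing, so here is a minimal TypeScript sketch of the same rules: making relative feed links absolute, inferring the post type from the link, and turning view text such as "1.2k" into a number. The function names are hypothetical and are not part of the attached project. Unlike the controller, this sketch trims the text and returns 0 for unparseable values instead of throwing.

```typescript
// Illustrative port of the per-post rules in CreatePosts.
// All function names here are hypothetical.

const SITE_ROOT = 'https://www.c-sharpcorner.com';

type PostType = 'Blog' | 'News' | 'Article';

// Relative feed links (starting with "/") are made absolute.
function normalizeLink(link: string): string {
  return link.indexOf('/') === 0 ? SITE_ROOT + link : link;
}

// The post type is inferred from the link text, checked in the same order
// as the controller: "blog" first, then "news", otherwise "Article".
function classifyLink(link: string): PostType {
  const lower = link.toLowerCase();
  if (lower.indexOf('blog') !== -1) return 'Blog';
  if (lower.indexOf('news') !== -1) return 'News';
  return 'Article';
}

// Converts the displayed view text ("950", "1.2k", "2.5m") into a number.
// Assumption of this sketch: missing or unparseable text yields 0.
function parseViewCount(views: string | null): number {
  if (views === null) return 0;
  const text = views.trim();
  let multiplier = 1;
  if (text.indexOf('m') !== -1) {
    multiplier = 1000000;
  } else if (text.indexOf('k') !== -1) {
    multiplier = 1000;
  }
  // Drop the trailing suffix character before parsing the numeric part.
  const numeric = multiplier === 1 ? text : text.slice(0, -1);
  const value = parseFloat(numeric);
  return isNaN(value) ? 0 : Math.round(value * multiplier);
}
```

Note that the type check looks at the whole link, so an article whose URL happens to contain "blog" or "news" would be classified by that word. The controller behaves the same way.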
We have successfully completed the Web API project. We can run the project and check the methods inside the controller. Since all the methods are HTTP GET, you can check them directly in any browser.
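As a quick reference for that browser check, the sketch below builds the URL for each route declared in the controller. The base URL is the one used by the Angular component in this article. The helper names and the author id "sample-author" are hypothetical, and the host and port depend on how you run the Web API project.

```typescript
// Base URL as used by the Angular component in this article; adjust the
// host and port to wherever the Web API project is running.
const API_BASE = 'http://localhost:4000/api/csharpcorner';

// Joins path segments onto the base URL, escaping each segment.
function apiUrl(...segments: string[]): string {
  return [API_BASE].concat(segments.map(s => encodeURIComponent(s))).join('/');
}

// One builder per controller route (hypothetical helper names).
const endpoints = {
  createPosts: (authorId: string) => apiUrl('CreatePosts', authorId),
  getAll: (authorId: string) => apiUrl('GetAll', authorId),
  getAuthors: () => apiUrl('GetAuthors'),
  getCategory: (authorId: string) => apiUrl('GetCategory', authorId),
  getPosts: (authorId: string, category: string, orderBy: string) =>
    apiUrl('GetPosts', authorId, category, orderBy),
};

// Example URLs to paste into a browser ("sample-author" is a placeholder id).
const exampleUrls: string[] = [
  endpoints.createPosts('sample-author'),
  endpoints.getAuthors(),
  endpoints.getPosts('sample-author', 'all', 'views'),
];
```

Call the “CreatePosts” URL first for an author, since the other routes only return data that has already been saved.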
Create an Angular 8 Project using CLI
We can create the Angular 8 project using the command below.
ng new AnalyticsAngular8
It will take some time to install all the node packages. We can then install the Chart.js package in our project using the command below.
npm install chart.js --save
Before we start coding in the Angular project, we also have to install the “bootstrap” library (for example, with npm install bootstrap --save).
Import the bootstrap stylesheet inside the “styles.css” file for future usage.
- /* You can add global styles to this file, and also import other style files */
- @import "~bootstrap/dist/css/bootstrap.css";
Let us import “HttpClientModule”, “FormsModule”, and “ReactiveFormsModule” inside the app.module file.
app.module.ts
- import { BrowserModule } from '@angular/platform-browser';
- import { NgModule } from '@angular/core';
-
- import { AppRoutingModule } from './app-routing.module';
- import { AppComponent } from './app.component';
-
- import { HttpClientModule } from '@angular/common/http';
- import { ReactiveFormsModule, FormsModule } from '@angular/forms';
-
- @NgModule({
- declarations: [
- AppComponent
- ],
- imports: [
- BrowserModule,
- AppRoutingModule,
- HttpClientModule,
- FormsModule,
- ReactiveFormsModule
- ],
- providers: [],
- bootstrap: [AppComponent]
- })
- export class AppModule { }
Modify the “app.component” component file with the code below.
app.component.ts
- import { Component, OnInit } from '@angular/core';
- import { HttpClient } from '@angular/common/http';
- import { Chart } from 'chart.js';
- import { FormGroup, FormBuilder } from '@angular/forms';
-
- @Component({
- selector: 'app-root',
- templateUrl: './app.component.html',
- styleUrls: ['./app.component.css']
- })
- export class AppComponent implements OnInit {
- constructor(private http: HttpClient, private fb: FormBuilder) { }
-
- authors: Author[] = [];
- posts: Post[] = [];
- authorForm: FormGroup;
- chartClicked: boolean;
- showDetails: boolean;
- showLoader: boolean;
-
- categories: string[] = [];
- counts: number[] = [];
- chart1: Chart;
- backColor: string[] = [];
- totalPosts: number;
-
- selectedCategory: string;
- selectedAuthor: string;
- selectedCount: number;
- selectedAuthorId: string;
-
- private baseUrl = 'http://localhost:4000/api/csharpcorner';
-
- ngOnInit(): void {
- this.authorForm = this.fb.group({
- authorId: '',
- chartType: 'pie',
- author: '',
- category: '',
- orderBy: 'pubDate'
- });
- this.showDetails = false;
- this.showLoader = false;
- this.showAuthors();
- }
-
- showAuthors() {
- this.http.get<Author[]>(this.baseUrl + '/getauthors').subscribe(result => {
- this.authors = result;
- }, error => console.error(error));
- }
-
- fillData() {
- this.fillCategory();
- }
-
- fillCategory() {
- if (this.chart1) this.chart1.destroy();
- this.showDetails = false;
- this.categories = [];
- this.counts = [];
- // Reset the colors as well, so the array does not grow on every call.
- this.backColor = [];
- this.chartClicked = true;
- this.authorForm.patchValue({
- category: ''
- });
- this.totalPosts = 0;
- this.selectedAuthorId = this.authorForm.value.author.AuthorId;
- this.http.get<Categroy[]>(this.baseUrl + '/getcategory/' + this.authorForm.value.author.AuthorId).subscribe(result => {
- result.forEach(x => {
- this.totalPosts += x.Count;
- this.categories.push(x.Name);
- this.counts.push(x.Count);
- this.backColor.push(this.getRandomColor());
- });
- if (result.length == 0 || this.selectedAuthorId == undefined) return;
- this.chart1 = new Chart('canvas1', {
- type: this.authorForm.value.chartType,
- data: {
- labels: this.categories,
- datasets: [
- {
- data: this.counts,
- borderColor: '#3cba9f',
- backgroundColor: this.backColor,
- fill: true
- }
- ]
- },
- options: {
- legend: {
- display: false
- },
- scales: {
- xAxes: [{
- display: false
- }],
- yAxes: [{
- display: false
- }],
- }
- }
- });
-
- }, error => console.error(error));
- }
-
- clickChart(event: any) {
- var evt = this.chart1.chart.getElementAtEvent(event);
- if (evt.length == 0) return;
- this.chartClicked = true;
- this.authorForm.patchValue({
- category: this.categories[evt[0]._index]
- });
- this.fillDetails();
- }
-
- populateData() {
- if (this.authorForm.value.authorId == '' || this.authorForm.value.authorId == undefined) {
- alert('Please give a valid Author Id');
- return;
- }
- this.showLoader = true;
- if (this.chart1) this.chart1.destroy();
- this.chartClicked = true;
- this.authorForm.patchValue({
- chartType: 'pie',
- author: ''
- });
- this.showDetails = false;
- this.http.get(this.baseUrl + '/CreatePosts/' + this.authorForm.value.authorId).subscribe(result => {
- this.showAuthors();
- this.showLoader = false;
- if (result == true) {
- alert('Author data successfully populated!');
- }
- else {
- alert('Invalid Author Id');
- }
- this.authorForm.patchValue({
- authorId: ''
- });
- }, error => console.error(error));
-
- }
-
- categorySelected() {
- if (this.chartClicked) {
- this.chartClicked = false;
- return;
- }
- this.fillDetails();
- }
-
- fillDetails() {
- var category = this.authorForm.value.category;
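- // Use global patterns so every occurrence is replaced, matching the server-side Replace calls.
- var newCategory = category.replace(/\./g, '~~~').replace(/&/g, '```').replace(/#/g, '!!!');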
- this.http.get<Post[]>(this.baseUrl + '/getposts/' + this.authorForm.value.author.AuthorId + '/' + newCategory + '/' + this.authorForm.value.orderBy).subscribe(result => {
- this.posts = result;
- this.selectedCategory = (category == 'all') ? 'All' : category;
- this.selectedCount = result.length;
- this.selectedAuthor = this.authorForm.value.author.Author;
- this.showDetails = true;
- }, error => console.error(error));
- }
-
- getRandomColor() {
- var letters = '0123456789ABCDEF';
- var color = '#';
- for (var i = 0; i < 6; i++) {
- color += letters[Math.floor(Math.random() * 16)];
- }
- return color;
- }
-
- private delay(ms: number) {
- return new Promise(resolve => setTimeout(resolve, ms));
- }
- }
-
- interface Author {
- AuthorId: string;
- Author: string;
- Count: number;
- }
-
- interface Categroy {
- Name: string;
- Count: number;
- }
-
- interface Post {
- Link: string;
- Title: string;
- Type: string;
- Category: string;
- Views: string;
- ViewsCount: number;
- Likes: number;
- PubDate: Date;
- }
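The “fillDetails” method swaps the characters ".", "&", and "#" in a category name for placeholder sequences before putting it in the URL, and the “GetPosts” action swaps them back. The following is a minimal sketch of that round trip, with hypothetical helper names. A string pattern passed to JavaScript's replace only changes the first match, so the sketch uses split and join to cover every occurrence.

```typescript
// Three backticks, built by concatenation so the listing stays easy to embed.
const BACKTICKS = '`' + '`' + '`';

// [character, placeholder] pairs used by fillDetails and GetPosts.
const SUBSTITUTIONS: Array<[string, string]> = [
  ['.', '~~~'],
  ['&', BACKTICKS],
  ['#', '!!!'],
];

// Replaces every occurrence of a literal string (no regex escaping needed).
function replaceEvery(text: string, search: string, replacement: string): string {
  return text.split(search).join(replacement);
}

// Client side: make the category safe to place in a URL path segment.
function encodeCategory(category: string): string {
  return SUBSTITUTIONS.reduce(
    (text, pair) => replaceEvery(text, pair[0], pair[1]),
    category
  );
}

// Server side equivalent: restore the original category name.
function decodeCategory(encoded: string): string {
  return SUBSTITUTIONS.reduce(
    (text, pair) => replaceEvery(text, pair[1], pair[0]),
    encoded
  );
}

// Example: a category name containing two of the special characters.
const encodedExample = encodeCategory('ASP.NET & Angular');
const decodedExample = decodeCategory(encodedExample);
```

The scheme assumes that no real category name already contains one of the placeholder sequences.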
Also, modify the corresponding HTML and CSS files with the code below.
app.component.html
app.component.css
-
-
- .file-loader {
- background-color: rgba(0, 0, 0, .5);
- overflow: hidden;
- position: fixed;
- top: 0;
- left: 0;
- right: 0;
- bottom: 0;
- z-index: 100000 !important;
- }
-
- .upload-loader {
- position: absolute;
- width: 60px;
- height: 60px;
- left: 50%;
- top: 50%;
- transform: translate(-50%, -50%);
- }
-
- .upload-loader .loader {
- border: 5px solid #f3f3f3 !important;
- border-radius: 50%;
- border-top: 5px solid #005eb8 !important;
- width: 100% !important;
- height: 100% !important;
- -webkit-animation: spin 2s linear infinite;
- animation: spin 2s linear infinite;
- }
-
- @-webkit-keyframes spin {
- 0% {
- -webkit-transform: rotate(0deg);
- }
- 100% {
- -webkit-transform: rotate(360deg);
- }
- }
-
- @keyframes spin {
- 0% {
- transform: rotate(0deg);
- }
- 100% {
- transform: rotate(360deg);
- }
- }
-
-
We have now completed the coding part of the Angular project as well. We can run both the Web API project and the Angular project.
Enter an author id and click the “Populate Author Data” button to fetch the author's post details from the C# Corner site.
Scraping the data from the site takes some time, depending on the number of posts the author has published.
After the data population completes, the user can select an author name from the drop-down list.
Whenever you process a new author's data, the author name is added to the drop-down list automatically. After you choose the author name, the post categories are shown as a chart. You can see the category name as a tooltip in the chart.
The posted categories are also added to the category drop-down.
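The chart and the category drop-down are both driven by per-category counts, which the “GetCategory” action computes with a LINQ GroupBy. As an illustration only, here is a minimal TypeScript sketch of the same grouping done in memory, followed by the split into the parallel label and value arrays that “fillCategory” builds. The type and function names are hypothetical, and the first-seen ordering is a choice of this sketch.

```typescript
// Minimal shapes for illustration; only the fields needed here are included.
interface CategorizedPost {
  Category: string;
}

interface CategoryCount {
  Name: string;
  Count: number;
}

// Groups posts by category and counts them, keeping first-seen order.
function countByCategory(posts: CategorizedPost[]): CategoryCount[] {
  const result: CategoryCount[] = [];
  posts.forEach(post => {
    const existing = result.filter(c => c.Name === post.Category)[0];
    if (existing) {
      existing.Count += 1;
    } else {
      result.push({ Name: post.Category, Count: 1 });
    }
  });
  return result;
}

// Splits the grouped rows into parallel arrays for chart labels and
// dataset values, and totals the posts along the way.
function toChartData(rows: CategoryCount[]): { labels: string[]; counts: number[]; total: number } {
  const labels = rows.map(r => r.Name);
  const counts = rows.map(r => r.Count);
  const total = counts.reduce((sum, n) => sum + n, 0);
  return { labels, counts, total };
}

// Example with three posts in two categories.
const sampleChartData = toChartData(
  countByCategory([
    { Category: 'Angular' },
    { Category: 'Azure' },
    { Category: 'Angular' },
  ])
);
```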
You can choose any of these categories to get the entire list of posts for that category.
Currently, I have added three types of charts: “Pie”, “Doughnut”, and “Polar Area”.
You can view the chart in any of these types. Below is a Polar Area chart. The previous chart was a Pie chart.
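Chart.js identifies these three chart types by the strings 'pie', 'doughnut', and 'polarArea', and the component passes the selected string as the chart's type. The sketch below maps the display names to those identifiers. The function name is hypothetical, and whether the drop-down stores the display name or the identifier is an assumption, since it depends on the template.

```typescript
// Chart.js type identifiers for the three chart types used in the app.
type ChartKind = 'pie' | 'doughnut' | 'polarArea';

// Maps a display name to the Chart.js identifier. Unknown names fall back
// to 'pie', which is also the form's default chart type in the component.
function chartTypeFor(label: string): ChartKind {
  switch (label) {
    case 'Doughnut':
      return 'doughnut';
    case 'Polar Area':
      return 'polarArea';
    default:
      return 'pie';
  }
}

// Example: the identifier to pass as the chart type for a polar area chart.
const polarKind: ChartKind = chartTypeFor('Polar Area');
```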
You can choose a category by clicking on the chart or by choosing from the category drop-down. All the posts for that category will be shown as below.
You can view the posts in different orders by choosing from the “Order By” drop-down.
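The ordering itself happens in the “GetPosts” action: likes and views sort in descending order, category and type sort in ascending order, and any other value sorts by publication date, newest first. As an illustration, here is a minimal in-memory TypeScript sketch of the same rules. The names are hypothetical, the publication date is assumed to arrive as an ISO date string, and string comparison here may differ from the database collation.

```typescript
// Subset of the post fields needed for ordering. PubDate is assumed to be
// an ISO date string, as it would arrive in a JSON response.
interface PostRow {
  Title: string;
  Type: string;
  Category: string;
  ViewsCount: number;
  Likes: number;
  PubDate: string;
}

// Returns a sorted copy, following the same rules as the GetPosts switch.
function orderPosts(posts: PostRow[], orderBy: string): PostRow[] {
  const sorted = posts.slice(); // keep the input array untouched
  switch (orderBy) {
    case 'likes':
      sorted.sort((a, b) => b.Likes - a.Likes);
      break;
    case 'views':
      sorted.sort((a, b) => b.ViewsCount - a.ViewsCount);
      break;
    case 'category':
      sorted.sort((a, b) => a.Category.localeCompare(b.Category));
      break;
    case 'type':
      sorted.sort((a, b) => a.Type.localeCompare(b.Type));
      break;
    default:
      // Any other value, including 'pubDate': newest first.
      sorted.sort((a, b) => Date.parse(b.PubDate) - Date.parse(a.PubDate));
  }
  return sorted;
}

// Made-up sample rows for illustration only.
const samplePosts: PostRow[] = [
  { Title: 'A', Type: 'Article', Category: 'Azure', ViewsCount: 1200, Likes: 5, PubDate: '2019-06-01' },
  { Title: 'B', Type: 'Blog', Category: 'Angular', ViewsCount: 300, Likes: 9, PubDate: '2019-08-15' },
  { Title: 'C', Type: 'News', Category: 'C#', ViewsCount: 2500000, Likes: 1, PubDate: '2019-07-10' },
];
```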
I have hosted the Angular app on Azure. You can analyze your post details through this Live App.
Conclusion
In this article, we saw how to scrape C# Corner author post information from the site using the “HtmlAgilityPack” library and a Web API service. We also saw how to show this data in an Angular 8 application, using the Chart.js library to render different types of charts and a grid to list the posts. Please note that C# Corner feeds currently return a maximum of 100 posts, so we can analyze at most 100 posts per author. I have attached the source code of both the Angular and Web API projects to this article. Please try it out and share your feedback on this article and the application.