John Riker
HTML Agility Pack: get specific URLs
Nov 1 2020 12:15 PM
I'm trying to parse a web page, get the URLs of all the shows listed on it, and then do some work with them. For example, take:
https://abc.com/shows/general-hospital
I want to grab all the General Hospital episode links listed on this page. I can't rely on it, but right now each link has "36m" in its name; a video could of course be shorter or longer. Is there any way to grab the whole top row of data? It may change a bit on Monday, since it will then cover part of October and part of November.
Right now I have this, which returns the text of every link on the entire page:
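static void Main()
{
    WebClient webClient = new WebClient();
    var page = webClient.DownloadString("https://abc.com/shows/general-hospital");

    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(page);

    // Selects every anchor that has an href attribute.
    // SelectNodes returns null when nothing matches, so check before looping.
    var links = doc.DocumentNode.SelectNodes("//a[@href]");
    if (links != null)
    {
        foreach (var link in links)
        {
            Console.WriteLine(link.InnerText);
        }
    }
}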
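One way to narrow the anchor query to episode pages only is to filter on the href value instead of the link text. This is a minimal sketch, not a confirmed fix: it assumes the episode links appear as plain <a> tags in the downloaded HTML and that their href contains "/episode-guide/". The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class EpisodeAnchors
{
    // Returns (link text, href) for anchors whose href contains "/episode-guide/".
    // The "/episode-guide/" fragment is an assumption about the site's URL layout.
    public static List<(string Text, string Href)> Find(string html)
    {
        var results = new List<(string Text, string Href)>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when no node matches the XPath.
        var anchors = doc.DocumentNode.SelectNodes("//a[contains(@href, '/episode-guide/')]");
        if (anchors == null)
        {
            return results;
        }

        foreach (var anchor in anchors)
        {
            results.Add((anchor.InnerText.Trim(), anchor.GetAttributeValue("href", "")));
        }
        return results;
    }
}
```

If the page builds its links with JavaScript after loading, the downloaded HTML may not contain these anchors at all, and this returns an empty list.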
So basically, if you look at the page source, I would need to target:
<script data-react-helmet="true" type="application/ld+json">
[{
  "@context": "http://schema.org/",
  "@type": "ItemList",
  "itemlistElement": [ [{
    "@type": "ListItem",
    "position": 1,
    "name": "November 2020",
    "item": []
  }],[{
    "@type": "ListItem",
    "position": 2,
    "name": "October 2020",
    "item": [{
      "@type": "Thing",
      "name": "General Hospital 10/30/20",
      "url": "www.abc.com/shows/general-hospital/episode-guide/2020-10/30-general-hospital-103020"
    },{
      "@type": "Thing",
      "name": "General Hospital 10/29/20",
      "url": "www.abc.com/shows/general-hospital/episode-guide/2020-10/29-general-hospital-102920"
    },{
      "@type": "Thing",
      "name": "General Hospital 10/28/20",
      "url": "www.abc.com/shows/general-hospital/episode-guide/2020-10/28-general-hospital-102820"
    },{
      "@type": "Thing",
      "name": "General Hospital 10/27/20",
      "url": "www.abc.com/shows/general-hospital/episode-guide/2020-10/27-general-hospital-102720"
    },{
      "@type": "Thing",
      "name": "General Hospital 10/26/20",
      "url": "www.abc.com/shows/general-hospital/episode-guide/2020-10/26-general-hospital-102620"
    },{
      "@type": "Thing",
      "name": "General Hospital 10/23/20",
      "url": "www.abc.com/shows/general-hospital/episode-guide/2020-10/23-general-hospital-102320"
    },{
      "@type": "Thing",
      "name": "General Hospital 10/22/20",
      "url": "www.abc.com/shows/general-hospital/episode-guide/2020-10/22-general-hospital-102220"
    },{
      "@type": "Thing",
      "name": "General Hospital 10/21/20",
      "url": "www.abc.com/shows/general-hospital/episode-guide/2020-10/21-general-hospital-102120"
    },{
      "@type": "Thing",
      "name": "General Hospital 10/20/20",
      "url": "www.abc.com/shows/general-hospital/episode-guide/2020-10/20-general-hospital-102020"
    },{
      "@type": "Thing",
      "name": "General Hospital 10/19/20",
      "url": "www.abc.com/shows/general-hospital/episode-guide/2020-10/19-general-hospital-101920"
    }]
  }],[{
I would probably want the name and URL of each item in every section.
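One possible way to get at that data is to select the JSON-LD script tag with HTML Agility Pack and read its contents with a JSON parser, rather than scraping the anchors. The sketch below assumes System.Text.Json is available (.NET Core 3.0 or later) and that the script content is valid JSON shaped like the excerpt above. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using HtmlAgilityPack;

public static class EpisodeLinks
{
    // Returns (name, url) for every "Thing" object found in the page's JSON-LD script blocks.
    public static List<(string Name, string Url)> Extract(string html)
    {
        var results = new List<(string Name, string Url)>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when no node matches the XPath.
        var scripts = doc.DocumentNode.SelectNodes("//script[@type='application/ld+json']");
        if (scripts == null)
        {
            return results;
        }

        foreach (var script in scripts)
        {
            try
            {
                using var json = JsonDocument.Parse(script.InnerText);
                Collect(json.RootElement, results);
            }
            catch (JsonException)
            {
                // Skip script blocks that do not contain valid JSON.
            }
        }
        return results;
    }

    // Walks arrays and objects recursively, so the nested [[{...}]] layout needs no special handling.
    private static void Collect(JsonElement element, List<(string Name, string Url)> results)
    {
        if (element.ValueKind == JsonValueKind.Array)
        {
            foreach (var child in element.EnumerateArray())
            {
                Collect(child, results);
            }
        }
        else if (element.ValueKind == JsonValueKind.Object)
        {
            if (element.TryGetProperty("@type", out var type)
                && type.ValueKind == JsonValueKind.String
                && type.GetString() == "Thing"
                && element.TryGetProperty("name", out var name)
                && name.ValueKind == JsonValueKind.String
                && element.TryGetProperty("url", out var url)
                && url.ValueKind == JsonValueKind.String)
            {
                results.Add((name.GetString() ?? "", url.GetString() ?? ""));
            }
            foreach (var property in element.EnumerateObject())
            {
                Collect(property.Value, results);
            }
        }
    }
}
```

Calling EpisodeLinks.Extract(page) with the string downloaded by the WebClient code would give one entry per episode. The URLs in the excerpt have no scheme, so they would likely need "https://" prepended before being requested.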