Introduction
The previous session covered how to select nodes using Html Agility Pack. This one covers another important topic: HTML manipulation using Html Agility Pack. When pages are generated dynamically, being able to change HTML content to suit the needs of different clients is a routine part of the job. HtmlNode exposes four important properties for reading or changing the content of a document on the fly. Each of them is described below with a short example.
Inner HTML
InnerHtml is a public property of HtmlAgilityPack.HtmlNode. It gets or sets the HTML between the opening and closing tags of the node. When you read it, the content is returned as a string.
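using System;
using HtmlAgilityPack;

// The paragraphs must be wrapped in <p> tags for the //body/p query to match.
var html =
@"<body>
<h1>.Net Core</h1>
<p>This is <b>C#, ASP.Net</b> paragraph</p>
<h1>.Net Core with Angular</h1>
<p>This is <b>HTML Agility Pack</b> sample</p>
</body>";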
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
var htmlNodes = htmlDoc.DocumentNode.SelectNodes("//body/p");
foreach (var node in htmlNodes)
{
Console.WriteLine(node.InnerHtml);
}
Output
This is <b>C#, ASP.Net</b> paragraph
This is <b>HTML Agility Pack</b> sample
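The example above only reads InnerHtml. Since the property can also be assigned, here is a minimal sketch of the write direction. The class and method names (InnerHtmlSetDemo, ReplaceFirstParagraph) and the sample markup are made up for illustration.

```csharp
using System;
using HtmlAgilityPack;

public static class InnerHtmlSetDemo
{
    // Replaces the markup inside the first <p> under <body> and
    // returns the markup of the whole document afterwards.
    public static string ReplaceFirstParagraph(string html, string newInnerHtml)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when nothing matches the XPath.
        var paragraph = doc.DocumentNode.SelectSingleNode("//body/p");
        if (paragraph == null)
        {
            return html;
        }

        // Assigning InnerHtml parses the string and swaps the node's children.
        paragraph.InnerHtml = newInnerHtml;
        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        var html = "<body><h1>.Net Core</h1><p>This is <b>C#, ASP.Net</b> paragraph</p></body>";
        Console.WriteLine(ReplaceFirstParagraph(html, "Updated <i>content</i>"));
    }
}
```

The null check matters because both SelectSingleNode and SelectNodes return null rather than an empty result when the query matches nothing.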
Inner Text
InnerText is also a public property of HtmlAgilityPack.HtmlNode and returns a string. Use it when all you want is the text between the opening and closing tags of the node: the tags of any child elements are dropped and only their text is kept, which makes it an easy way to read content dynamically.
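using System;
using HtmlAgilityPack;

// The paragraphs must be wrapped in <p> tags for the //body/p query to match.
var html =
@"<body>
<h1>.Net Core</h1>
<p>This is <b>C#, ASP.Net</b> paragraph</p>
<h1>.Net Core with Angular</h1>
<p>This is <b>HTML Agility Pack</b> sample</p>
</body>";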
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
var htmlNodes = htmlDoc.DocumentNode.SelectNodes("//body/p");
foreach (var node in htmlNodes)
{
Console.WriteLine(node.InnerText);
}
Output
This is C#, ASP.Net paragraph
This is HTML Agility Pack sample
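One detail worth knowing when reading text: InnerText strips tags but leaves character entities such as &amp; as they were written. A minimal sketch of decoding them with HtmlEntity.DeEntitize follows. The class and method names (InnerTextDemo, ReadDecodedText) and the sample markup are hypothetical.

```csharp
using System;
using HtmlAgilityPack;

public static class InnerTextDemo
{
    // Returns the text of the first node matching the XPath,
    // with entities decoded and surrounding whitespace trimmed.
    public static string ReadDecodedText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node == null)
        {
            return string.Empty;
        }

        // InnerText drops the <b> tags; DeEntitize turns &amp; into &.
        return HtmlEntity.DeEntitize(node.InnerText).Trim();
    }

    public static void Main()
    {
        var html = "<body><p>C# &amp; <b>ASP.Net</b></p></body>";
        Console.WriteLine(ReadDecodedText(html, "//body/p"));
    }
}
```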
Outer Html
OuterHtml returns the node itself together with its contents. It resembles InnerHtml, but with one important difference: the returned string also includes the node's own opening and closing tags. It is a public property of HtmlAgilityPack.HtmlNode and returns a string.
using System;
using HtmlAgilityPack;

var html =
@"<body>
<h1>.Net Core</h1>
<p>This is <b>C#, ASP.Net</b> paragraph</p>
<h1>.Net Core with Angular</h1>
<p>This is <b>HTML Agility Pack</b> sample</p>
</body>";
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
var htmlNodes = htmlDoc.DocumentNode.SelectNodes("//body/p");
foreach (var node in htmlNodes)
{
Console.WriteLine(node.OuterHtml);
}
Output
<p>This is <b>C#, ASP.Net</b> paragraph</p>
<p>This is <b>HTML Agility Pack</b> sample</p>
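To make the difference between the three string properties concrete, the sketch below reads OuterHtml, InnerHtml and InnerText from the same node. The class and method names (NodeViewsDemo, Views) are hypothetical, chosen only for this illustration.

```csharp
using System;
using HtmlAgilityPack;

public static class NodeViewsDemo
{
    // Returns OuterHtml, InnerHtml and InnerText (in that order)
    // for the first node matching the XPath.
    public static string[] Views(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node == null)
        {
            return new string[0];
        }

        return new[] { node.OuterHtml, node.InnerHtml, node.InnerText };
    }

    public static void Main()
    {
        var html = "<body><p>This is <b>HTML Agility Pack</b> sample</p></body>";
        foreach (var view in Views(html, "//body/p"))
        {
            Console.WriteLine(view);
        }
    }
}
```

For this input, the first line printed includes the surrounding p tags, the second keeps only the inner markup with its b element, and the third is plain text.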
Parent Node
ParentNode returns the parent of the given node. Sometimes you need to know which element contains a node, and this property serves exactly that purpose. It is a public property of HtmlAgilityPack.HtmlNode, and its type is HtmlNode.
using System;
using HtmlAgilityPack;

var html =
@"<body>
<h1>.Net Core</h1>
<p>This is <b>C#, ASP.Net</b> paragraph</p>
<h1>.Net Core with Angular</h1>
<p>This is <b>HTML Agility Pack</b> sample</p>
</body>";
var htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(html);
var node = htmlDoc.DocumentNode.SelectSingleNode("//body/h1");
HtmlNode parentNode = node.ParentNode;
Console.WriteLine(parentNode.Name);
Output
body
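Because ParentNode itself returns an HtmlNode, it can be followed repeatedly to walk up the tree. The sketch below collects the names of all ancestors of a node, assuming the loop ends when ParentNode is null at the document root. The class and method names (ParentChainDemo, AncestorNames) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ParentChainDemo
{
    // Returns the names of the ancestors of the first node matching
    // the XPath, starting with the nearest parent.
    public static List<string> AncestorNames(string html, string xpath)
    {
        var names = new List<string>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node == null)
        {
            return names;
        }

        // Follow ParentNode upwards until there is no parent left.
        for (var current = node.ParentNode; current != null; current = current.ParentNode)
        {
            names.Add(current.Name);
        }

        return names;
    }

    public static void Main()
    {
        var html = "<body><p>This is <b>HTML Agility Pack</b> sample</p></body>";
        Console.WriteLine(string.Join(" > ", AncestorNames(html, "//b")));
    }
}
```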