Learn HAP: How to use XPath using HTML Agility Pack?

Introduction

XPath stands for XML Path Language. It is used to navigate to specific elements and attributes in an HTML or XML document. XPath is a major element of the XSLT standard and a W3C Recommendation, and it uses a "path-like" syntax to identify and navigate nodes in a document. Before going through this article, you may want to read the previous article on what HTML Agility Pack is to refresh the basics.

How to use XPath using HTML Agility Pack

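XPath also comes with a large library of built-in functions, commonly cited as more than 200 in XPath 2.0 and later. HTML Agility Pack evaluates XPath 1.0 through .NET's System.Xml.XPath, which has a much smaller core set of functions. A path expression selects a single node or a node-set in the document; in this article we use one inside the xPathByHTMLAgility() method to extract text from a web page.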
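To make the difference between a single node and a node-set concrete, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is installed; the sample markup, the class name XPathSelectionSketch, and the helper names SelectText and SelectTexts are made up for this illustration and are not part of the library.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class XPathSelectionSketch
{
    // Hypothetical sample markup, used only for this illustration.
    const string SampleHtml =
        "<html><body><h1>Languages</h1>" +
        "<ul><li>C#</li><li>Java</li><li>Python</li></ul>" +
        "</body></html>";

    // Returns the text of the first node matching the XPath, or null if none matches.
    public static string SelectText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node?.InnerText;
    }

    // Returns the text of every node matching the XPath (an empty list if none match).
    public static List<string> SelectTexts(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var result = new List<string>();
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null) return result; // by default SelectNodes yields null when nothing matches
        foreach (HtmlNode n in nodes) result.Add(n.InnerText);
        return result;
    }

    static void Main()
    {
        // Single node: the first <h1> in the document.
        Console.WriteLine(SelectText(SampleHtml, "//h1"));
        // Node-set: every <li> directly under a <ul>.
        Console.WriteLine(string.Join(", ", SelectTexts(SampleHtml, "//ul/li")));
    }
}
```

SelectSingleNode is the call used in the demo later in this article; SelectNodes is the one to reach for when the expression is expected to match several elements.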


Components in XPath

Built-in functions are available for string values, numeric values, booleans, date and time comparison, node handling, sequence handling, and much more. Today, XPath expressions can also be used from JavaScript, XML Schema, Java, PHP, Python, C, C++, and many other languages.
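As a small illustration of functions inside a path expression, the sketch below filters links with contains(), starts-with(), and last(), all of which belong to XPath 1.0. It assumes the HtmlAgilityPack NuGet package; the sample links, the class name XPathFunctionSketch, and the helper LinkTexts are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class XPathFunctionSketch
{
    // Hypothetical sample markup, used only for this illustration.
    const string SampleHtml =
        "<html><body>" +
        "<a href='/docs/xpath-intro'>Intro</a>" +
        "<a href='/blog/news'>News</a>" +
        "<a href='/docs/xpath-functions'>Functions</a>" +
        "</body></html>";

    // Returns the text of every node matched by the XPath (an empty list if none match).
    public static List<string> LinkTexts(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        var result = new List<string>();
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null) return result; // nothing matched
        foreach (HtmlNode n in nodes) result.Add(n.InnerText);
        return result;
    }

    static void Main()
    {
        // contains(): links whose href includes the substring "xpath"
        Console.WriteLine(string.Join(", ", LinkTexts(SampleHtml, "//a[contains(@href, 'xpath')]")));
        // starts-with(): links whose href begins with "/blog"
        Console.WriteLine(string.Join(", ", LinkTexts(SampleHtml, "//a[starts-with(@href, '/blog')]")));
        // last(): the last <a> among its sibling <a> elements
        Console.WriteLine(string.Join(", ", LinkTexts(SampleHtml, "//a[last()]")));
    }
}
```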

XPath 1.0, XPath 2.0 and XPath 3.0 have all been published as W3C Recommendations.

Using XPath with the HtmlDocument class

Here we use XPath with the HtmlDocument class to scrape a web page and extract the information we need.

XPath Demo to ‘Extract text using XPath’

Step #1: Define object of HtmlWeb

HtmlWeb web = new HtmlWeb();

Step #2: Define object of HtmlDocument()

HtmlDocument doc = new HtmlDocument();

Step #3: Load the document to run the XPath statement against

doc = web.Load("https://www.technologycrowds.com/2019/10/compute-sha-256-hash-using-csharp-for-effective-secruity.html");

Step #4: Extract text using an XPath statement

var _extractText = doc.DocumentNode.SelectSingleNode("/html/body/div[5]/div/div/div/div[1]/div/div/div[2]/div[1]/div[2]/article/div[2]/div/div[2]").InnerText;
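An absolute path like the one in Step #4 depends on the exact position of every div, so it can stop matching when the page layout changes. A shorter relative expression with an attribute predicate is often easier to maintain. The sketch below shows the idea on inline markup; it assumes the HtmlAgilityPack NuGet package, and the "post-body" class, the class name RelativeXPathSketch, and the helper ExtractText are assumptions for illustration, not the real structure of the page loaded above.

```csharp
using System;
using HtmlAgilityPack;

static class RelativeXPathSketch
{
    // Hypothetical markup: the "post-body" class is an assumption for illustration,
    // not the real structure of the page used in the steps of this article.
    const string SampleHtml =
        "<html><body><div id='header'>Site menu</div>" +
        "<article><div class='post-body'>  Hash &amp; salt basics  </div></article>" +
        "</body></html>";

    // Returns the trimmed, entity-decoded text of the first match, or "" if nothing matches.
    public static string ExtractText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node == null) return string.Empty; // avoid a NullReferenceException on InnerText
        return HtmlEntity.DeEntitize(node.InnerText).Trim();
    }

    static void Main()
    {
        // "//" searches at any depth, and [@class='post-body'] filters by attribute value.
        Console.WriteLine(ExtractText(SampleHtml, "//article//div[@class='post-body']"));
    }
}
```

The null check matters because SelectSingleNode returns null when the expression matches nothing, which is the usual symptom of a path that no longer fits the page.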

Step #5: Final Method demonstrating XPath

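// XPath Method (needs "using System;" and "using HtmlAgilityPack;" at the top of the file)
static void xPathByHTMLAgility()
{
 HtmlWeb web = new HtmlWeb();
 HtmlDocument doc = new HtmlDocument();

 // Download the page and parse it into the document
 doc = web.Load("https://www.technologycrowds.com/2019/10/compute-sha-256-hash-using-csharp-for-effective-secruity.html");
 // SelectSingleNode returns null when nothing matches, so check before reading InnerText
 var _node = doc.DocumentNode.SelectSingleNode("/html/body/div[5]/div/div/div/div[1]/div/div/div[2]/div[1]/div[2]/article/div[2]/div/div[2]");
 var _extractText = _node != null ? _node.InnerText : string.Empty;
 Console.WriteLine(_extractText);
}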

Step #6: Final Output using XPath

Running the method prints the text of the selected node to the console.

Conclusion

XPath is a very important feature when we work on web scraping, data mining, or extracting text from a specific web page with HTML Agility Pack. For more information, get in touch with us and browse our free video library.


Anjan kant
