Hex Dump any URL - Screen Scrape Viewer
Published: 01 Nov 2003
Unedited - Community Contributed
Illustrates basic techniques for screen scraping and provides code for a viewer that will show the code for any site in hex, html, or browser view.
by Steve Sharrock

This example illustrates how to create a screen scrape viewer that will display the selected URL in several formats, including a hex dump. I remember reading an old Petzold book (perhaps the 1st Windows book) where he said the first program he wrote for any new platform was a "file" hex dump utility to help learn the platform and provide a basic debugging tool. I've updated this here to use internet "screen-scraping" rather than just the local file system. I use this when "View Source" from the browser doesn't give me quite enough detail, particularly with the newline codes, tabs and other special characters I may need to navigate when parsing the data on the page.

The ASPX page includes a TextBox into which the user enters the URL. In addition to a Submit button, there are three radio buttons to choose the display format: Hex, HTML(Ascii), Web. As you will see later in the code, the last option (Web) opens a new browser window to display the specified URL, using JavaScript generated by the code-behind. The other items on the page are the output display TextBox and an Error Label control. The Error Label is only visible if there is a problem.

  <FORM id="HexDump" method="post" runat="server">
    <asp:textbox id="UrlCtrl" runat="server"
      HEIGHT="28px" WIDTH="701px"></asp:textbox> <BR>
    <FONT size="-2"><B>Enter the url like http://www.microsoft.com</B></FONT> <BR>
    <INPUT type="submit" value="Submit">
    <asp:radiobutton id="HexBtn" runat="server" AUTOPOSTBACK="True"
      Checked="True" GroupName="DisplayType" Text="Hex"></asp:radiobutton>
    <asp:radiobutton id="AsciiBtn" runat="server" AUTOPOSTBACK="True"
      GroupName="DisplayType" Text="HTML(Ascii)"></asp:radiobutton>
    <asp:radiobutton id="WebBtn" runat="server" AUTOPOSTBACK="True"
      GroupName="DisplayType" Text="Web"></asp:radiobutton>
    <asp:label id="ErrorLbl" runat="server" Width="822px" Height="37px"></asp:label>
    <asp:textbox id="DisplayCtrl" runat="server" Width="95%"
      Height="300px" TEXTMODE="MultiLine"></asp:textbox>
  </FORM>

In the code below you see that everything happens within the Page_Load method during a postback. In addition to the Submit button, each of the radio buttons also causes a postback and a reload of the page, which displays the output in the requested format.

First the URL is checked to make sure it is not empty, and then "http://" is prepended if it begins with "www". If the "Web" radio button is checked, I generate some JavaScript to launch a new browser window with the specified URL. This allows me to keep viewing the page on which I am inspecting the hexadecimal output.
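
The URL check is small enough to try on its own, outside the page. The following is only a sketch: the `UrlHelper` class and the `NormalizeUrl` name are invented for illustration and are not part of the download. The logic follows the description above.

```csharp
using System;

public static class UrlHelper
{
    // Hypothetical helper (not in the original listing): returns null for an
    // empty entry and prefixes "http://" when the entry starts with "www".
    public static string NormalizeUrl(string input)
    {
        string url = (input ?? "").Trim();
        if (url.Length == 0)
            return null; // the page would show the error label here

        if (url.Length > 3 && url.Substring(0, 3).ToLower() == "www")
            url = "http://" + url;

        return url;
    }
}
```

An entry that already carries a scheme, such as "http://www.microsoft.com", passes through unchanged because it does not start with "www".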

Thanks to the WebClient class, the actual screen-scrape is performed in the two lines of code within the "try" block. If this is successful, the text is displayed in either its normal HTML format, or in its hexadecimal representation.

// requires: using System; using System.Net; using System.Text;
private void Page_Load(object sender, System.EventArgs e)
{
  ErrorLbl.Visible = false;
  if ( this.IsPostBack )
  {
    string text;
    string url;
    url = UrlCtrl.Text.Trim();
    if ( url.Length == 0 )
    {
      SetError("You need to enter a URL");
      return;
    }
    if ( url.Length > 3 && url.Substring(0,3).ToLower() == "www" )
      url = "http://" + url;

    if ( WebBtn.Checked )
    { // open the URL in a separate browser window named WebView
      // (the call name is missing from the listing; RegisterStartupScript is assumed)
      RegisterStartupScript( "WebView",
        "<script language='javascript'>" +
          "window.open('" + url + "','WebView')</script>");
      return;
    }
    try
    { // the actual screen scrape
      byte[] bytes = new WebClient().DownloadData( url );
      text = new UTF8Encoding().GetString( bytes );
    }
    catch( Exception ex )
    {
      SetError( ex.Message );
      return;
    }
    if ( HexBtn.Checked )
      DisplayCtrl.Text = GetHex( text );
    else DisplayCtrl.Text = text;
  }
}
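
The download step can also be exercised outside the page, for example from a console program. This is a rough sketch rather than code from the download: the `Scraper` class and `TryScrape` method are hypothetical names, and the text is assumed to be UTF-8, as in Page_Load.

```csharp
using System;
using System.Net;
using System.Text;

public static class Scraper
{
    // Hypothetical wrapper around the two lines in the "try" block.
    // Returns true and the page text on success; otherwise false and the
    // exception message, which the page would hand to SetError.
    public static bool TryScrape(string url, out string text, out string error)
    {
        try
        {
            using (WebClient client = new WebClient())
            {
                byte[] bytes = client.DownloadData(url);
                // Assumes the response is UTF-8, as the article's code does.
                text = Encoding.UTF8.GetString(bytes);
            }
            error = null;
            return true;
        }
        catch (Exception ex)
        {
            text = null;
            error = ex.Message;
            return false;
        }
    }
}
```

Returning the error message instead of throwing keeps the caller's logic the same as in Page_Load: show the message and stop, or go on to format the text.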

The GetHex function takes the input string and converts it to the formatted display text, with 16 hex values on the left followed by their display characters on the right. I didn't spend a lot of time here, but I wanted to show one of the basic uses of the StringBuilder class: the Append method. Those of us with an MFC background tend to expect the StringBuilder methods to be part of the basic String class. Since strings are immutable, however (as they should be), we use the StringBuilder class instead. It is well worth studying this class to learn all that it can do.
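
For readers new to StringBuilder, here is a minimal sketch of the two operations GetHex relies on: Append to grow a buffer, and setting Length to 0 to reuse it. The `StringBuilderDemo` class and `Chunk` method are made up for this illustration.

```csharp
using System;
using System.Text;

public static class StringBuilderDemo
{
    // Breaks text into lines of at most `width` characters, reusing one
    // small buffer per line and appending each finished line to the output.
    public static string Chunk(string text, int width)
    {
        StringBuilder line = new StringBuilder(); // current line, reused
        StringBuilder all = new StringBuilder();  // complete output

        foreach (char c in text)
        {
            line.Append(c);
            if (line.Length == width)
            {
                all.Append(line.ToString()).Append("\r\n");
                line.Length = 0; // empty the buffer without allocating a new one
            }
        }
        all.Append(line.ToString()); // whatever is left of the last line
        return all.ToString();
    }
}
```

Calling `StringBuilderDemo.Chunk("abcdefg", 3)` yields the three lines "abc", "def" and "g", separated by "\r\n".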

public static string GetHex( string txt )
{
  int i;
  // disp is the 16-bytes of display
  StringBuilder disp = new StringBuilder();
  // hex is the complete output (hex+disp)
  StringBuilder hex = new StringBuilder();

  for( i = 0; i < txt.Length; i++ )
  {
    if ( i > 0 )
    {
      if ( i % 16 == 0 )
      {
        if ( hex.Length > 0 )
        { // end current line
          hex.Append( "  " + disp.ToString() + "\r\n" );
          disp.Length = 0;
        }
      }
      else if ( i % 8 == 0 )
        hex.Append("- "); // separator between the two groups of 8
    }
    hex.Append( string.Format("{0:x2} ", (int)txt[i] ) );
    // printable ASCII only; 127 (DEL) is a control character
    if ( txt[i] >= ' ' && txt[i] < 127 )
      disp.Append( txt[i] );
    else disp.Append( '.' );
  }
  // end of text - make sure we end the last line of hex
  if ( disp.Length > 0 )
  {
    if ( disp.Length < 16 )
    {
      // the "- " separator has not been written yet for 8 or fewer values
      if ( disp.Length <= 8 )
        hex.Append("  ");
      for( i = disp.Length; i < 16; i++ )
        hex.Append("   ");
    }
    hex.Append( "  " + disp.ToString() );
  }
  return hex.ToString();
} // end GetHex( txt )

private void SetError( string err )
{
  ErrorLbl.Visible = true;
  ErrorLbl.Text = err;
}
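
Because GetHex works on the decoded string, a character outside the ASCII range shows its character code rather than the bytes that actually came over the wire. When the raw bytes matter, a byte-based variant is a small change. The sketch below is an assumption about how that might look, not code from the download; `HexBytes` and `Dump` are hypothetical names, and the layout copies GetHex.

```csharp
using System;
using System.Text;

public static class HexBytes
{
    // Hypothetical byte-based variant of GetHex: same 16-per-line layout,
    // but each column is one byte of the downloaded data.
    public static string Dump(byte[] data)
    {
        StringBuilder hex = new StringBuilder();
        StringBuilder disp = new StringBuilder();

        for (int i = 0; i < data.Length; i++)
        {
            if (i > 0 && i % 16 == 0)
            {
                // finish the current line and start a new one
                hex.Append("  ").Append(disp.ToString()).Append("\r\n");
                disp.Length = 0;
            }
            else if (i % 16 == 8)
            {
                hex.Append("- "); // separator between the two groups of 8
            }

            hex.Append(data[i].ToString("x2")).Append(' ');
            // printable ASCII is shown as-is, everything else as '.'
            disp.Append(data[i] >= 0x20 && data[i] < 0x7f ? (char)data[i] : '.');
        }

        if (disp.Length > 0)
        {
            // pad a short last line so the text column lines up
            if (disp.Length <= 8)
                hex.Append("  ");
            hex.Append(' ', (16 - disp.Length) * 3);
            hex.Append("  ").Append(disp.ToString());
        }
        return hex.ToString();
    }
}
```

In Page_Load this would be called with the `bytes` array returned by DownloadData instead of `text`. For the single character "é" sent as UTF-8, `Dump` shows the two bytes `c3 a9`, whereas GetHex on the decoded string shows `e9`.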


I've thrown the GetHex method into my utilities toolbox that I use for general debugging. I also use this with a WinForm application for viewing local files. I can also use this web version to look at files on the server using "file://" instead of "http://". My next effort is going to be to figure out how to use "ftp://".


You can download HexDump.zip if you want to use these two files and don't like to type.

