PixieWare Software

We'll connect you easier, faster and more reliably to data

PixieRobot Introduction

PixieRobot has several automated and unattended functions, which can be run interactively or in a scheduled mode.

  • Processing "Web Pages" - Used to gather data from the web and store it on a local machine for subsequent processing, for example storing it in a database or emailing it to another system.
  • Processing "Email File Transfers" - Used to send and receive emails with data-file attachments between systems. Attachments could contain such things as database updates, web extracts, or source code. Application systems can drop off files in, or receive files from, the PixieRobot message folders.
  • PixieRobot in combination with PixieWeb can be used to move data in and out of PICK systems. An example script can be found here: PixieRobot PICK extract script.
  • PLUS - We can write extract scripts for you. The scripts are yours to keep, so you can run them as many times as required. We charge per script, NOT per transaction extracted.

    This page is a focal point for PixieRobot as a web robot: a program that visits web-sites, reads their pages, inspects the HTML, processes instructions or rules written in scripts, extracts data, and saves the data in another, more structured format. The robot visits only the pages it needs and collects only the targeted data, which keeps each run efficient, and it does so in an off-line and unattended mode. PixieRobot harnesses Internet Explorer (IE), so to a web server its requests look like those of a human user browsing with IE.

    Data found on the WWW is an important resource. Once it is published, it is available to many people and organisations, and may be interacted with using a wide range of tools (e.g. a browser). PixieRobot is another tool for utilising Web pages as a source of data. It is used to intelligently and accurately "browse" the HTML content on standard Web pages.

    Web browsing is the process of opening a Web page via the Internet and displaying the data from that Web page. The data is visually clear to any person looking at the Web page, but it is buried within the underlying HTML code, which is a mixture of presentation rules and data. The purpose of PixieRobot is to capture such page contents and make them available to your "script" program, which deciphers the embedded data. PixieRobot also extends the basic VBScript language with a library of functions that make it easier to identify the structural elements of a Web page. You can read these elements and also manipulate them to carry on the next step of a conversation with that web site.

    Major Benefits of PixieRobot are:

  • Reducing people costs by using specialised scripts
  • Speeding up the process by automating the task
  • Eliminating errors and improving accuracy of the extracted data

    Distinctive features of PixieRobot are:

  • Automated and unattended
  • Navigation to web pages and drilling down
  • Filling-in of forms and submission of them
  • Capture of pictures and data
  • Output of the picture and data for other uses

    How PixieRobot Works

    PixieRobot runs the specified script, which calls a special function (ExecuteWWW) to retrieve the HTML of a selected Web page. The HTML is returned as a text string ("s"), which may be searched for the pre-pattern and post-pattern sub-strings that surround the desired data. The text between the two patterns is the data to be extracted. The extracted data may itself be manipulated (e.g. converted from text to a numeric value) and then written to another file. If the default file is an .XLS file, the extracted raw HTML may be fed directly into a spreadsheet and formatted automatically. This .XLS technique may be used to simplify the process of defining delimited pattern searches (e.g. there is no need to find and remove <td> tags).
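
    The pre-pattern and post-pattern search described above can be sketched in plain VBScript. This is only an illustration under stated assumptions, not part of the PixieRobot function library: ExtractBetween is a hypothetical helper name, and the sample HTML string is invented so the sketch can run on its own under Windows Script Host. In a real script the string would come from ExecuteWWW.

```vbscript
Option Explicit

' Hypothetical helper (not a PixieRobot library function): returns the text
' found between prePattern and postPattern in s, or "" if either is missing.
Function ExtractBetween(s, prePattern, postPattern)
    Dim startPos, endPos
    ExtractBetween = ""
    startPos = InStr(1, s, prePattern, vbTextCompare)
    If startPos = 0 Then Exit Function
    startPos = startPos + Len(prePattern)      ' first character after the pre-pattern
    endPos = InStr(startPos, s, postPattern, vbTextCompare)
    If endPos = 0 Then Exit Function
    ExtractBetween = Mid(s, startPos, endPos - startPos)
End Function

Dim html, priceText, price
html = "<tr><td>Widget</td><td>Price: $1250.50</td></tr>"   ' invented sample data
priceText = ExtractBetween(html, "Price: $", "</td>")

' Convert the extracted text to a numeric value. IsNumeric and CDbl follow
' the machine's regional settings, so the decimal separator must match them.
If IsNumeric(priceText) Then
    price = CDbl(priceText)
End If
```

    Returning an empty string when a pattern is missing lets the calling script decide what to do when a page layout changes, rather than stopping with a run-time error from Mid.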

    Data from a source transaction file may be read and inserted into the data elements of a Web page form, thus automating a Web page update (submit) process (e.g. submitting product sales or purchase orders into another system, or submitting a product for an auction).

    The HTML code patterns must be analysed and encapsulated in a PixieRobot script before any data can be extracted. For example, the following VBScript extracts price data: it scans the raw HTML to find the string "Prices are subject" and then searches for the <table> and </table> tags that follow. The table, from its opening tag to its closing tag, is extracted for subsequent use (e.g. stored in an MS_Excel table).

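    Sub Main
    Monitor = True
    Silent = True
    ' Fetch the page; ExecuteWWW returns its HTML as a text string
    s = ExecuteWWW("http://www.pixieware.com/totprices.htm")
    ' Find the marker text (case-insensitive), then the <table that follows it
    iPos = Instr(1, s, "Prices are subject", 1)
    If iPos > 0 Then iPos = Instr(iPos, s, "<table ", 1)
    If iPos = 0 Then Exit Sub ' marker or table not found, nothing to extract
    s = Mid(s, iPos)
    ' Cut the string at the end of the closing </table> tag (8 characters long)
    iPos = Instr(1, s, "</table", 1)
    If iPos = 0 Then Exit Sub
    s = Left(s, iPos+7)
    Call OutputToFile(s, "prices.xls")
    End Sub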

    The extracted string (found in variable "s") can be placed in the nominated output file or emailed to a recipient mail-box. An .xls file extension will cause MS_Excel to read the extracted data and HTML and format the data into columns and rows in the same manner as a browser.
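
    When individual values are needed rather than a spreadsheet, the same InStr/Mid technique can walk the cells of an extracted table by hand. The following is a minimal sketch under stated assumptions: CellsToList is a hypothetical helper name (not a PixieRobot function), the sample table is invented, and each cell is assumed to hold plain text with no nested tags.

```vbscript
Option Explicit

' Hypothetical helper: collects the text of each <td>...</td> cell in
' tableHtml into one semicolon-separated string.
' Assumes cells contain plain text only (no nested tags).
Function CellsToList(tableHtml)
    Dim pos, openEnd, closePos, result
    result = ""
    pos = InStr(1, tableHtml, "<td", vbTextCompare)
    Do While pos > 0
        openEnd = InStr(pos, tableHtml, ">")          ' end of the opening <td ...> tag
        If openEnd = 0 Then Exit Do
        closePos = InStr(openEnd, tableHtml, "</td", vbTextCompare)
        If closePos = 0 Then Exit Do
        If Len(result) > 0 Then result = result & ";"
        result = result & Trim(Mid(tableHtml, openEnd + 1, closePos - openEnd - 1))
        pos = InStr(closePos, tableHtml, "<td", vbTextCompare)   ' next cell, if any
    Loop
    CellsToList = result
End Function

Dim sample, cells
sample = "<table><tr><td>Widget</td><td align=""right""> 12.50 </td></tr></table>"   ' invented sample
cells = CellsToList(sample)   ' cells is now "Widget;12.50"
```

    This is the kind of tag handling that writing the raw HTML to an .xls file avoids, since the spreadsheet interprets the table markup itself.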

    When running interactively, windows appear on the desktop to control PixieRobot's behavior, for example to change the script to be run. PixieRobot indicates when it has started and finished, and it displays the selected URL to verify that it is available for processing, thus providing a simple check that the script is working.

    "If PixieRobot is used to extract (scrape) data from web-sites, we expect all users of PixieRobot to read and comply with the legal (patent, copyright, trademark) and other intellectual property assertions made by each site's owners in respect of the use of their web-site contents." This message does not constitute legal advice and is not a substitute for the professional judgment of an attorney should you need assistance.

    Email: sales@pixieware.com

    Copyright © 1999 - - PixieWare Software