Author Topic: Extracting data from an html string  (Read 2745 times)

Using a TCPObject I can get the line
"<div class="message" id="a47c0"><span class="chatAvatar" src="http://forum.blockland.us/avatarUpload/avatar_num" /><b> <a href="http://forum.blockland.us/index.php?action=profile;u=profilenumber">Username</a></b>: Message</div>"

How would I go about extracting the username and the message from this string? No, I cannot use an HTMLObject because this requires POST instead of GET.

"<div class="message" id="a47c0"><span class="chatAvatar" src="http://forum.blockland.us/avatarUpload/avatar_num" /><b> <a href="http://forum.blockland.us/index.php?action=profile;u=profilenumber">Username</a></b>: Message</div>"
Code: [Select]
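function parse(%string)
{
    %len = strlen(%string);
    %start = "";
    %words = "";
    for(%q = 0; %q < %len; %q++)
    {
        %char = getSubStr(%string, %q, 1);
        if(%char $= ">")
            %start = %q + 1; // text starts just after the ">"
        else if(%char $= "<" && %start !$= "")
        {
            // getSubStr takes a length, not an end position
            %word = getSubStr(%string, %start, %q - %start);
            %start = "";
            if(%word !$= "") // skip empty gaps between adjacent tags
                %words = (%words $= "") ? %word : %words TAB %word;
        }
    }
    return %words; // tab-separated text found between > and <
}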

Something like this should return everything between the > and < signs, which according to what you posted is a space, the name, and the message with a : in front of it. Then just strip out the : and ignore the leading space. (Note that return exits the function right away, loop included, so anything you want to keep has to be collected before returning.)
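
A minimal sketch of that last step, assuming you already have the ": Message" segment in hand. stripLeadingColon is a made-up name, not a built-in; it only relies on the stock trim, strlen and getSubStr functions.

Code: [Select]
// Hypothetical helper: drops one leading ":" and the surrounding spaces.
function stripLeadingColon(%segment)
{
    %segment = trim(%segment);
    if(getSubStr(%segment, 0, 1) $= ":")
        %segment = trim(getSubStr(%segment, 1, strlen(%segment) - 1));
    return %segment;
}

Only the first colon is removed, so a colon inside the message itself is left alone.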

http://forum.blockland.us/index.php?topic=177446.0
This is a good resource for working through this kind of stuff.

I would just write something tailored to this situation. For example:

Code: [Select]
function stripBLData(%line) {
  %start = strPos(%line, "action=profile"); // Find an area close to the beginning of the username. We want to make sure we grab the right data, so starting closer to the actual data reduces possible incorrect grabbing.
  %start = strPos(%line, ">", %start) + 1; // Look for the first closing bracket that comes after action=profile. This is our start point to get data.
  %end = strPos(%line, "</a>", %start);  // Look for </a> after the start of the username. That's where the username ends.
  %username = getSubStr(%line, %start, %end - %start); // Get the text between these two points (getSubStr takes a length, not an end position). This is our username.

  %start = strPos(%line, "</b>: ", %end) + 6; // Find the point where our message starts.
  %end = strPos(%line, "</div>", %start); // Find the point where our message ends.
  %message = getSubStr(%line, %start, %end - %start); // Get the data between these two points (again passing a length, not an end position).
  return %username TAB %message; // Return the data.
}
« Last Edit: July 13, 2014, 02:27:13 AM by $trinick »
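
A rough sketch of how that function might be hooked up to the TCPObject, in case it helps. ChatTCP is a hypothetical object name, since the setup code isn't shown in this thread; onLine is the callback a TCPObject fires for each line it receives, and getField pulls the tab-separated pieces back out.

Code: [Select]
// Assumes the object was created as: new TCPObject(ChatTCP);
function ChatTCP::onLine(%this, %line)
{
  // Skip anything that doesn't look like a chat message line.
  if(strPos(%line, "class=\"message\"") == -1)
    return;

  %data = stripBLData(%line);
  %username = getField(%data, 0); // First tab-separated field.
  %message = getField(%data, 1);  // Second tab-separated field.
  echo(%username @ ": " @ %message);
}

A message that itself contains a tab would spill into extra fields, so this is only safe as long as the chat never sends tabs.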

Don't you mean an HTTP Object, not an HTML object?

Oops, yes, I did mean that.

This will help with the downloading part if you need it: http://forum.blockland.us/index.php?topic=257317.0

Thanks, that will be helpful. I already made what I needed to make; however, it is the most HORRIFIC code you could ever imagine. It works, though!

You wanna see horrific code? Then look at the ABC parser I'm working on XD

I made this if you want to look through the code. It basically watches the forums and notifies you of things.

http://elmshouse.weebly.com/blockland-streaming.html