Suppose I have an IRC chat log where, each time someone says something, a line of this form is added:
[hh:mm] <'handle'> 'this is what they said'
What I want to do is extract each person's name and what they typed, so I'm using a regex like so:
m = re.search(r"\[\d{2}:\d{2}]\s+<(.+?)>\s(.+?)\n", dtext)
dtext is where the chat log is stored.
The pattern captures two groups per matching line, so the total number of groups is always even. The first group is the username and the second is whatever they said. What I want to do is count each person's words per line. By "line" I mean everything from when they start typing to when they submit, so it's not a literal line of text.
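To make the two groups concrete, here is a minimal sketch of the pattern applied to a single made-up log line. The sample text and the names LINE_RE, sample, name, said and word_count are assumptions for illustration only. The pattern is the one above, except that the trailing \n is swapped for $ so it can be tested against one line at a time.

```python
import re

# Same pattern as above, but ending in (.+)$ instead of (.+?)\n so it
# matches a single line that has no trailing newline.
LINE_RE = re.compile(r"\[\d{2}:\d{2}]\s+<(.+?)>\s(.+)$")

# Hypothetical log line, invented for this example.
sample = "[12:34] <alice> hello there everyone"

m = LINE_RE.search(sample)
name = m.group(1)               # group 1: the handle between < and >
said = m.group(2)               # group 2: everything after the handle
word_count = len(said.split())  # split on whitespace to count words
```

Counting words with str.split() treats any run of whitespace as one separator, which is probably close enough for chat text.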
I figured the most efficient way to do this is to make a "type" (I don't know what the equivalent of this is in Python) called "Line", for example, with the attributes "Name" and "Words". I can then iterate through the chat log text file line by line. Each line that matches will have a name and whatever they said to go with it, which I can add to an overall array and do the math on later.
So really the parts I need help with are:
1. creating a "type"
2. reading a file line by line (I already know how to open one)
3. matching a regex pattern against that line
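For illustration, the three steps could fit together roughly as follows. This is only a sketch under some assumptions: the closest Python equivalent of a "type" is taken to be a class (written here with dataclasses), the log is assumed to be UTF-8 text, and the names Line, parse_lines, parse_file and words_per_line are made up for this example. The pattern ends in $ rather than \n because each line is stripped before matching.

```python
import re
from dataclasses import dataclass

# Pattern from the question, ending in (.+)$ so it fits one stripped line.
LINE_RE = re.compile(r"\[\d{2}:\d{2}]\s+<(.+?)>\s(.+)$")


@dataclass
class Line:
    """One chat message: who said it and how many words it had."""
    name: str
    words: int


def parse_lines(lines):
    """Return a Line for every input line that matches the pattern."""
    parsed = []
    for raw in lines:
        m = LINE_RE.search(raw.rstrip("\n"))
        if m is None:
            continue  # skip anything that is not a chat message
        parsed.append(Line(name=m.group(1), words=len(m.group(2).split())))
    return parsed


def parse_file(path):
    """Read the log line by line; iterating a file object yields its lines."""
    with open(path, encoding="utf-8") as f:
        return parse_lines(f)


def words_per_line(parsed):
    """Average number of words per message for each name."""
    totals = {}
    for line in parsed:
        count, words = totals.get(line.name, (0, 0))
        totals[line.name] = (count + 1, words + line.words)
    return {name: words / count for name, (count, words) in totals.items()}
```

Calling words_per_line(parse_file("chat.log")), where "chat.log" is a placeholder path, would give a dict mapping each handle to its average words per message. Keeping the parsing in parse_lines, which accepts any iterable of strings, makes it easy to try on a plain list before pointing it at a real file.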