Parsing improvements with MPP
Spider has recently been set up to observe Flowbird's Mobile Payment Platform (MPP).
It brought its share of surprises and unhandled cases, which led to improvements in parsing :)
But new users really like it!!!
Quality of parsing
Quality of parsing is great. Spider often shows no issue at all over millions of communications.
But on MPP it was different: there were errors all the time.
After analysis and improvements, I found out that:
- Parallel parsing of the same TCP session could occur, and status reporting did not handle it correctly
  - I added version management to the parsing log
  - I added some constraints on the collection of parsing status

  So parsing quality is still the same, but quality reporting is better :)
- PHP is not following the HTTP standard
  - Some responses have no Content-Length and no chunked encoding.
  - Some chunked response bodies... don't end with the final '0' chunk.
  - Expect: 100-continue may be requested while the body is sent in the same packet!

  For the first two, it is hard to tell whether parsing is... finished.
  And as this does not follow the standard, I did not try to understand it further.
  -> A poor quality line is reported when, in fact, everything is parsed correctly.
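
To illustrate why the first two cases are hard to close, here is a minimal Python sketch of how a parser might decide whether a response body is complete. It is not Spider's code: `body_framing` and `chunked_body_complete` are hypothetical names, and the logic only follows the usual HTTP/1.1 framing rules (chunked encoding first, then Content-Length, otherwise the body runs until the connection closes).

```python
def body_framing(headers):
    """Classify how the end of an HTTP/1.1 response body can be detected.

    `headers` is a plain dict of header names to values (an assumption
    made for this sketch, not Spider's data model).
    """
    lowered = {name.lower(): value.strip().lower() for name, value in headers.items()}
    codings = [c.strip() for c in lowered.get("transfer-encoding", "").split(",") if c.strip()]
    if codings and codings[-1] == "chunked":
        return "chunked"
    if "content-length" in lowered:
        return "content-length"
    # No declared length: the only end marker is the connection closing.
    return "until-close"


def chunked_body_complete(body: bytes) -> bool:
    """Return True only if the terminating zero-size chunk is found.

    Trailers after the last chunk are ignored in this sketch.
    """
    pos = 0
    while True:
        line_end = body.find(b"\r\n", pos)
        if line_end == -1:
            return False  # no further chunk-size line: body is truncated
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        try:
            size = int(size_field, 16)
        except ValueError:
            return False  # not a valid hexadecimal chunk size
        if size < 0:
            return False
        if size == 0:
            return True  # the final '0' chunk
        pos = line_end + 2 + size + 2  # skip chunk data and its trailing CRLF
        if pos > len(body):
            return False
```

With a response that has neither header, `body_framing` returns `"until-close"`, and with a chunked body that lacks the final '0' chunk, `chunked_body_complete` returns `False`. In both situations a parser working from captured packets cannot be sure the body is finished, which is consistent with the poor quality lines described above.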
100 continue special case
For the last point, I had to add a hack in the parsing process that cuts the packet and injects duplicated packets so that the exchange respects the standard. But I managed :-)
See:
HTTP communication with Expect, but with the body inline and 2 responses at once:
Spider parsing:
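
The cut described in this section can be sketched as follows. This is only an illustration in Python, not Spider's implementation: `split_expect_request` is a hypothetical helper that takes the raw payload of a single packet and, when the request carries `Expect: 100-continue` with the body already inline, separates the request head from the body so that each part can be handled as if it had arrived in its own packet.

```python
def split_expect_request(payload: bytes):
    """Split a request payload into (head, body).

    The split happens only when the headers contain Expect: 100-continue
    and body bytes follow the headers in the same payload. In every other
    case the payload is returned untouched with an empty body.
    """
    head, sep, body = payload.partition(b"\r\n\r\n")
    if not sep:
        return payload, b""  # headers are not complete in this packet

    expects_continue = False
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"expect" and value.strip().lower() == b"100-continue":
            expects_continue = True

    if not expects_continue or not body:
        return payload, b""
    return head + sep, body


# Example payload (made up for this sketch): head and body in one packet.
packet = (
    b"POST /pay HTTP/1.1\r\n"
    b"Host: example.test\r\n"
    b"Expect: 100-continue\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
head, body = split_expect_request(packet)
```

Here `head` ends with the blank line that closes the headers and `body` holds the five inline bytes. How the two parts are then re-injected into the packet flow is specific to Spider and is not covered by this sketch.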