After three months of gestation and some bug fixes, no successful hacks of HtmlSanitizer have been reported.
Does it mean that it rocks? I don’t think so, but it is probably strong enough to sail in stormy waters.
See my previous post to learn how it works, but above all test it online in the Patapage playground.
To be honest, I received some complaints about the blacklist approach to CSS styles, but no one has hacked the current version (yet 🙂 ). In any case the code is open to changes, and I’m happy to receive your feedback.
Now we are proud to announce that there is a C# port by Beyers Cronje (thank you). You can find the C# sources here (and the Java ones here). Warning: the source code has already been patched as suggested by Isaiah.
Other ports are welcome!
Hi Roberto, congratulations on the code. I wanted to point something out: I tried copying text that came from Word (another long-standing problem) and noticed that the code left behind a closing tag of the form o:p.
Regards
I found a couple of bugs, present in both the Java and C# versions:
1. Self-closed tags were being converted to a pair of tags;
test case: <param/><param/> becomes <param><param></param></param>
2. Incorrect index in the replaceAllNoRegex function;
buffer.Append(source.Substring(oldPos, pos));
should be
buffer.Append(source.Substring(oldPos, search.Length));
Here is a patch for the C#:
Correction: Bug #2 is ONLY in the C# code. Substring in C# takes (start position, length), whereas substring in Java takes (start position, end position).
Another correction (sorry):
buffer.Append(source.Substring(oldPos, search.Length)); is wrong, it should read
buffer.Append(source.Substring(oldPos, pos - oldPos));
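To make the difference between the two substring conventions concrete, here is a minimal Java sketch of a regex-free replace-all. It is a hypothetical re-implementation written for illustration, not the actual HtmlSanitizer source: the class name, the parameter names and the empty-search guard are assumptions. Only the name replaceAllNoRegex and the variables buffer, source, search, oldPos and pos come from the discussion above.

```java
public class ReplaceAllNoRegexSketch {

    // Hypothetical sketch, not the library's real code.
    // Replaces every occurrence of `search` in `source` with `replace`,
    // using plain indexOf instead of regular expressions.
    static String replaceAllNoRegex(String source, String search, String replace) {
        if (search.isEmpty()) {
            return source; // assumption: guard against an endless loop
        }
        StringBuilder buffer = new StringBuilder();
        int oldPos = 0;
        int pos;
        while ((pos = source.indexOf(search, oldPos)) >= 0) {
            // Java: substring(beginIndex, endIndex), so `pos` is passed as is.
            // C#:   Substring(startIndex, length), so it needs `pos - oldPos`.
            buffer.append(source.substring(oldPos, pos));
            buffer.append(replace);
            oldPos = pos + search.length();
        }
        // Copy whatever follows the last match.
        buffer.append(source.substring(oldPos));
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAllNoRegex("a<b>c<b>d", "<b>", "-")); // prints a-c-d
    }
}
```

In the C# port the same loop body would read buffer.Append(source.Substring(oldPos, pos - oldPos)), which is the corrected line given above.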
Thanks for the fixes, Isaiah.
Thanks a lot for sharing your code.
Please update the C# source file as fixed by Isaiah.
Done.
Thanks for reminding me!
It is really nice, but it kind of kills relative URLs in img tags: the src attribute gets “killed”.
A fantastic blog post. I just passed this on to a university student who was doing a little research on this, and he in fact bought me lunch because I found it for him. So let me reword that: thank you for the treat! But yeah, thanks for taking the time to talk about this. I feel strongly about it and enjoy reading more on this topic. If possible, as you gain expertise, would you mind updating your blog with more details? It is extremely helpful for me. Big thumbs up for this share!
not working well
You can do it better….