Convert RTF to Plain Text: Java Swing to the Rescue

You want to convert RTF to plain text? Here's the code:


<cffunction name="getPlainTextFromRichText" access="public" returntype="string" output="false">
    <cfargument name="richText" type="string" required="true">
    <cfscript>
        // Swing's RTF editor kit parses the markup; the styled document receives the parsed content.
        var RTFEditorKit = CreateObject("java","javax.swing.text.rtf.RTFEditorKit").init();
        var styledDocument = CreateObject("java","javax.swing.text.DefaultStyledDocument").init();
        var reader = CreateObject("java","java.io.StringReader").init(arguments.richText);

        // Parse the RTF into the document, starting at offset 0.
        RTFEditorKit.read(reader,styledDocument,0);
        // Return the document's text, without the RTF formatting codes.
        return styledDocument.getText(0,styledDocument.getLength());
    </cfscript>
</cffunction>
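
If you call this from plain Java instead of CFML, the same three steps (create the kit, read into a styled document, pull the text back out) could look roughly like the sketch below. The class name `RtfToText`, the method name `toPlainText`, and the choice to wrap the checked exceptions in an `IllegalArgumentException` are my assumptions for illustration, not part of the original function.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical helper class; mirrors the CFML function getPlainTextFromRichText.
final class RtfToText {

    // Parses the RTF markup with Swing's RTFEditorKit and returns only the text content.
    static String toPlainText(String richText) {
        RTFEditorKit kit = new RTFEditorKit();
        DefaultStyledDocument doc = new DefaultStyledDocument();
        try {
            // Read the RTF into the document, starting at offset 0.
            kit.read(new StringReader(richText), doc, 0);
            return doc.getText(0, doc.getLength());
        } catch (IOException | BadLocationException e) {
            // Assumption: callers prefer an unchecked exception for unreadable input.
            throw new IllegalArgumentException("Could not read RTF input", e);
        }
    }

    public static void main(String[] args) {
        String rtf = "{\\rtf1\\ansi{\\fonttbl\\f0\\fswiss Helvetica;}\\f0\\pard "
                   + "This is some {\\b bold} text.\\par}";
        // Prints the sentence without the RTF control words.
        System.out.println(toPlainText(rtf));
    }
}
```

The result may carry a trailing line break for each `\par` in the input, so trim it if you need an exact match.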

If you know a better solution, please let me know...

...and to strip HTML tags, let Java's HTML parser do the work instead of writing a regex:


<cffunction name="getPlainTextFromHTML" access="public" returntype="string" output="false">
    <cfargument name="html" type="string" required="true">
    <cfscript>
        // Swing's HTML editor kit parses the markup; the HTML document receives the parsed content.
        var HTMLEditorKit = CreateObject("java","javax.swing.text.html.HTMLEditorKit").init();
        var styledDocument = CreateObject("java","javax.swing.text.html.HTMLDocument").init();
        var reader = CreateObject("java","java.io.StringReader").init(arguments.html);

        // Parse the HTML into the document, starting at offset 0.
        HTMLEditorKit.read(reader,styledDocument,0);
        // Return the document's text, without the tags.
        return styledDocument.getText(0,styledDocument.getLength());
    </cfscript>
</cffunction>
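
The HTML variant can be sketched in plain Java the same way. Two things here differ from the CFML function and are my own assumptions: the document is created with `createDefaultDocument()` rather than by constructing `HTMLDocument` directly, and the `IgnoreCharsetDirective` property is set so that a `<meta>` tag declaring a charset does not interrupt parsing when the input is already a string. The class name `HtmlToText` and the exception wrapping are hypothetical as well.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class; mirrors the CFML function getPlainTextFromHTML.
final class HtmlToText {

    // Parses the HTML with Swing's HTMLEditorKit and returns only the text content.
    static String toPlainText(String html) {
        HTMLEditorKit kit = new HTMLEditorKit();
        // The kit creates an HTMLDocument that matches its own parser and style sheet.
        Document doc = kit.createDefaultDocument();
        // Assumption: the input is already decoded, so charset declarations can be ignored.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        try {
            // Read the HTML into the document, starting at offset 0.
            kit.read(new StringReader(html), doc, 0);
            return doc.getText(0, doc.getLength());
        } catch (IOException | BadLocationException e) {
            // Assumption: callers prefer an unchecked exception for unreadable input.
            throw new IllegalArgumentException("Could not read HTML input", e);
        }
    }

    public static void main(String[] args) {
        // Prints the paragraph text without the tags.
        System.out.println(toPlainText("<p>Hello <b>world</b></p>").trim());
    }
}
```

The returned text can start or end with line breaks that the parser adds for block elements, so the usage example trims it before printing.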

Comments
Thanks. It might be surely helpful for me.
# Posted By ppshein | 1/12/11 7:00 AM
This is GREAT! I am building a resume import function for our system and dealing with the RTF and .DOC formatting codes was horrendous. I love the simplicity and speed of this solution for converting the read-in file to plain text for line-by-line parsing. The parsing algorithm is rather complex...but it wouldn't have worked at all without this tool. Many thanks!
# Posted By Dustin | 2/25/11 1:40 PM
Thanks Dustin, you want to share your own parser with me?
# Posted By Ernst van der Linden | 2/28/11 9:27 PM