TalkPHP


Tanax 02-05-2009 10:52 PM

Text from Word-document
 
Hi!

I run the website for our neighbourhood, and sometimes they have protocols that I need to publish on our website.

The problem is that when they write the protocol, they use Word.
If it's just text, it works fine to copy it and paste it into the news manager I've coded.

However, sometimes they've used bold text, which is easy enough to handle with <strong>. The real problem is when they use tables.

So I'm wondering if PHP can access a Word document's contents and copy the text from it (with a custom function perhaps?), and if it finds a table, write out an HTML table that matches it? Bold text should become <strong>, and italic text should stay as it is.


Thanks in advance!
Tanax

Scottymeuk 02-05-2009 11:54 PM

You could use a JavaScript editor like TinyMCE. That will allow you to keep all of the formatting when copied.

sketchMedia 02-06-2009 12:00 AM

Hmm, I suppose you could cobble something together with COM.

Tanax 02-06-2009 12:06 AM

Quote:

Originally Posted by Scottymeuk (Post 21627)
You could use a JavaScript editor like TinyMCE. That will allow you to keep all of the formatting when copied.

Yes, I've thought about that. But that doesn't support tables, does it?

Quote:

Originally Posted by sketchMedia (Post 21628)
Hmm, I suppose you could cobble something together with COM.

What's COM?

Wildhoney 02-06-2009 12:11 AM

How are their tables typically laid out in the text files?

sketchMedia 02-06-2009 12:57 AM

http://www.microsoft.com/com/default.mspx

You could do something like this; however, I believe you will need Word installed on your server (help me out, Windows guys), which may or may not be possible. Anyhow, this is how you would do it (I think :-/).
PHP Code:

<?php
try
{
    // Resolve the source document and derive the output name from it.
    $sDoc  = realpath('word.doc');
    $aPath = pathinfo($sDoc);
    $sName = $aPath['filename'];

    // Start Word through COM, open the document, and save it as HTML.
    $pWordCOM = new COM("word.application");
    $pWordCOM->Documents->Open($sDoc);
    $pWordCOM->ActiveDocument->SaveAs($sName, 8); //8 is the number for htm ext
    $pWordCOM->ActiveDocument->Close(false);
    $pWordCOM->Quit();
    unset($pWordCOM);
}
catch(Exception $e)
{
?>
    <h4>Error: </h4>
    <p><?php echo $e->getMessage(); ?></p>
    <br />
    <h4>Stack Trace</h4>
    <pre>
    <?php echo $e->getTraceAsString(); ?>
    </pre>
<?php
}

Basically, it uses COM to open Word, then opens the document, then saves it as htm.

Not tested btw.

Tanax 02-06-2009 01:44 AM

Quote:

Originally Posted by Wildhoney (Post 21632)
How are their tables typically laid out in the text files?

Like a basic table in Word? There's a table tool in Word that lets you make a table.. Like that

Quote:

Originally Posted by sketchMedia (Post 21634)
http://www.microsoft.com/com/default.mspx

You could do something like this, however I believe you will need word installed on your server!(help me out windows guys) which may or may not be possible, any who this is how you would do it (I think :-/).

Based on what I read on php.net about COM, that looks like it could work. However, I couldn't find any info about the SaveAs function, or about the integer you pass to it.. Where did you find that information?

Thanks!

Wildhoney 02-06-2009 01:55 AM

That works, sketchMedia, with a little bit of re-working. I used the following code and it output a txt document for me, along with a folder containing 2 other files:

php Code:
try
{
    $szOpenDocument = 'C:/wamp/www/word/word-open.doc';
    $szSaveDocument = 'C:/wamp/www/word/word-save.txt';

    $pWordCOM = new COM("word.application");
    $pWordCOM->Documents->Open($szOpenDocument);
    $pWordCOM->ActiveDocument->SaveAs($szSaveDocument, 8);
    $pWordCOM->ActiveDocument->Close(false);
    $pWordCOM->Quit();
    unset($pWordCOM);
}
catch(Exception $e)
{
    die($e->getMessage());
}

Tanax 02-06-2009 02:07 AM

Quote:

Originally Posted by Wildhoney (Post 21638)
That works SketchMedia, with a little bit of re-working.

Thanks!

Just a question though.
Where do you find info about the Documents and Open thingy?
As used here:
PHP Code:

$pWordCOM->Documents->Open($szOpenDocument); 

Also, these things?
PHP Code:

$pWordCOM->ActiveDocument->SaveAs($szSaveDocument, 8);

As said before, I found the info about the COM class, but it didn't have any of those functions.. ActiveDocument nor Documents..

Anyways, thanks! However, I didn't really want to save it as a txt.. I wanted to get the HTML code so I can post it on the website, without having to write the HTML code for bold and tables wherever they are used..
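
For the goal of getting HTML that can be posted directly, here is a minimal sketch of a cleanup step. It assumes the document has already been exported to HTML (for example by Word's SaveAs with format 8). `cleanWordHtml` is a hypothetical helper, not part of PHP or of any code posted in this thread. It keeps only the body, turns <b> into <strong>, leaves <i> as it is, drops every tag outside a small whitelist, and strips attributes. The regular expressions are deliberately simple and will not cope with attribute values that contain `>`.

```php
<?php
// Hypothetical helper: reduce Word-exported HTML to a handful of plain tags.
function cleanWordHtml(string $html): string
{
    // Keep only the <body> contents if a full document was passed in.
    if (preg_match('~<body[^>]*>(.*)</body>~is', $html, $m)) {
        $html = $m[1];
    }

    // Map <b ...> and </b> to <strong> and </strong>; <i> is left alone.
    $html = preg_replace('~<(/?)b(\s[^>]*)?>~i', '<${1}strong>', $html);

    // Drop every tag that is not on the whitelist (text content is kept).
    $html = strip_tags($html, '<p><strong><i><table><tr><td><th><br>');

    // Remove all attributes (class="MsoNormal", style="...", width=...).
    $html = preg_replace('~<(\w+)\s[^>]*>~', '<$1>', $html);

    return trim($html);
}
```

Something like `echo cleanWordHtml(file_get_contents('word-save.htm'));` would then print the reduced markup; the file name is only an example.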

sketchMedia 02-06-2009 02:50 AM

Quote:

Originally Posted by Tanax (Post 21639)
Thanks!

Just a question though.
Where do you find info about the Documents and Open thingy?

A lot of searching on MSDN (hateful site).

http://msdn.microsoft.com/en-us/library/bb216319.aspx
http://msdn.microsoft.com/en-us/library/bb221597.aspx
http://msdn.microsoft.com/en-us/library/bb238158.aspx

It should save it in html format, theoretically.
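
The `8` passed to SaveAs is the value of `wdFormatHTML` in Word's `WdSaveFormat` enumeration; `wdFormatFilteredHTML` is `10` and leaves out much of the Office-specific markup. A hedged sketch with named constants follows, assuming Windows with Word installed and PHP's COM extension enabled. `wordToHtml` and the constant names are made up for illustration, and this has not been tested against a real Word install.

```php
<?php
// Values taken from Word's WdSaveFormat enumeration.
const WD_FORMAT_HTML = 8;           // wdFormatHTML
const WD_FORMAT_FILTERED_HTML = 10; // wdFormatFilteredHTML

// Hypothetical wrapper around the COM calls used in this thread.
function wordToHtml(string $source, string $target, int $format = WD_FORMAT_HTML): void
{
    // The COM class only exists on Windows builds with the extension enabled.
    if (!class_exists('COM')) {
        throw new RuntimeException('COM extension not available (Windows with Word required).');
    }

    $word = new COM('word.application');
    try {
        $word->Documents->Open($source);
        $word->ActiveDocument->SaveAs($target, $format);
        $word->ActiveDocument->Close(false);
    } finally {
        // Always shut Word down, even if opening or saving failed.
        $word->Quit();
    }
}
```

A call such as `wordToHtml('C:/wamp/www/word/word-open.doc', 'C:/wamp/www/word/word-save.htm', WD_FORMAT_FILTERED_HTML);` would write an .htm file instead of a .txt one; the paths are only examples.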

CoryMathews 02-06-2009 03:09 AM

I have been looking for a way to do this on a linux shared server, meaning no word to open. Anyone got a solution for that? I had to ask..

Tanax 02-06-2009 12:11 PM

Quote:

Originally Posted by sketchMedia (Post 21641)

Thanks a lot!
And yes, I hate that site.. :S

But thanks, I'll try it!

maZtah 02-06-2009 12:33 PM

Why don't you code an option so that a user can upload files with a news item, for example? That way the user can just upload the Word document to the server (and visitors can download the document).

Tanax 02-06-2009 12:39 PM

Because these so-called "users" are 60 years old and don't know how to use a computer, let alone upload files xD The only thing they know is Word, so I have to do all the news publishing. However, yes, I could allow the visitors from our neighbourhood to download the Word document, but it's better to actually be able to publish something, so that's why :-)

