At the moment we use OpenOffice 3.3 and JODConverter 2.2 to convert office documents to PDF, and swftools to convert the PDF in turn to SWF, which is played in the course player (https://wiki.exphosted.com/doku.php/content_management).
Some documents are not displayed correctly because OpenOffice lacks support for some of the MS Office 2007+ properties. We should try to improve this.
1) LibreOffice 4.1.2 is observed to handle MS Office 2007+ documents better than OpenOffice 4.0.
2) We can stay with JODConverter 2.2: JODConverter 3.0 starts its own process for each conversion, which slows conversion down, and 3.0 does not support running conversions on remote machines.
3) With LibreOffice the MS Office 2007 document conversion improves considerably, but there are still issues. It can be improved further by passing a filter option that converts the document to the PDF/A-1a version. PDF/A-1a is a standard meant to keep the document's display the same across devices: it embeds fonts and external objects into the PDF. Passing this option seems to slow down the conversion, however.
With JODConverter 2.2 we can pass an argument that points to a sample registry XML file with the filters needed. The office home can also be passed as an argument.
java -Doffice.home=/opt/libreoffice4.1 -jar #{RAILS_ROOT}/vendor/jodconverter-2.2.2/lib/jodconverter-cli-2.2.2.jar -x document-formats.xml src.pptx dst.pdf
Settings to convert to PDF/A-1a and compress images (in document-formats.xml):
<entry>
  <string>FilterData</string>
  <map>
    <!-- Tagged PDF is left disabled:
    <entry>
      <string>UseTaggedPDF</string>
      <boolean>true</boolean>
    </entry>
    -->
    <entry>
      <string>SelectPdfVersion</string>
      <int>1</int>
    </entry>
    <entry>
      <string>ReduceImageResolution</string>
      <boolean>true</boolean>
    </entry>
    <entry>
      <string>MaxImageResolution</string>
      <int>400</int>
    </entry>
  </map>
</entry>
4) On an Ubuntu machine LibreOffice 4 seems to handle pptx documents much better than on the CentOS machines we use for our environments. The versions and configs look the same on both machines, but we are not sure whether a system library is the reason for this behaviour. This needs further investigation.
Either figure out a way to further improve MS Office 2007+ document conversions with PDF version 0 (SelectPdfVersion set to 0) on CentOS, or a way to improve the speed of conversion with the PDF/A-1a option using LibreOffice.
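The JODConverter invocation described in point 3 can be assembled in Ruby roughly as follows. This is a minimal sketch, not the application's actual code: the method name `jodconverter_command`, its keyword arguments and the `/path/to/app` root are hypothetical, while the jar location, the `-Doffice.home` property and the `-x` registry option are copied from the command line shown in point 3.

```ruby
# Hypothetical helper: builds the argument list for the JODConverter 2.2.2 CLI.
# Jar path and option names mirror the command shown in point 3; the method
# name and keyword arguments are assumptions made for illustration.
def jodconverter_command(src, dst, rails_root:, office_home:, registry_file: nil)
  jar = File.join(rails_root, "vendor/jodconverter-2.2.2/lib/jodconverter-cli-2.2.2.jar")
  cmd = ["java", "-Doffice.home=#{office_home}", "-jar", jar]
  # Pass -x only when a custom registry such as document-formats.xml is wanted.
  cmd += ["-x", registry_file] if registry_file
  cmd + [src, dst]
end

cmd = jodconverter_command("src.pptx", "dst.pdf",
                           rails_root: "/path/to/app", # placeholder, not a real path
                           office_home: "/opt/libreoffice4.1",
                           registry_file: "document-formats.xml")
puts cmd.join(" ")
```

Keeping the command as an array means it can be handed to `system(*cmd)` without shell quoting problems for file names that contain spaces.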
1. Download libreoffice 4.1.2 and swftools 0.9.2
wget http://download.documentfoundation.org/libreoffice/stable/4.1.2/rpm/x86_64/LibreOffice_4.1.2_Linux_x86-64_rpm.tar.gz
wget http://www.swftools.org/swftools-0.9.2.tar.gz
2. Install swftools
tar xzfv swftools-0.9.2.tar.gz
cd swftools-0.9.2
./configure
make
sudo make install
# verify the version
pdf2swf --version
# pdf2swf - part of swftools 0.9.2
3. Stop God. Replace openoffice in god script file with office.
/deploy/systasks/god.sh stop
vi /deploy/systasks/god.sh
# replace openoffice with office.
4. Remove openoffice and install libreoffice
# remove existing office installations
yum remove openoffice* libreoffice*
tar -xvf LibreOffice_4.1.2_Linux_x86-64_rpm.tar.gz
cd /tmp/LibreOffice_4.1.2_Linux_x86-64_rpm/RPMS/
yum localinstall *.rpm
5. Open development.rb and add doc conversion related settings
vi /deploy/crossbow/shared/config/development.rb
# Add the following lines
# Document conversion related settings
OFFICE_HOME = "/opt/libreoffice4.1"
JOD_CONVERTER_USE_REGISTRY_FILE = false
PDF2SWF_ENABLE_POLY2BITMAP = false
6. Change god related configuration.
vi /deploy/crossbow/shared/config/development.god
# Change the following lines
# Line no - 3
OFFICE_PATH = "/opt/libreoffice4.1/program"
# Change openoffice to office in whole document.
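How the application reads the `PDF2SWF_ENABLE_POLY2BITMAP` setting is not shown on this page. The sketch below is one plausible way it could be used when building the pdf2swf call. The helper name `pdf2swf_command` is hypothetical; `-s poly2bitmap` and `-o` are regular pdf2swf options.

```ruby
# Setting name taken from the configuration above; in the application it would
# come from the environment file rather than being defined here.
PDF2SWF_ENABLE_POLY2BITMAP = false

# Hypothetical helper: argument list for converting a PDF to SWF with pdf2swf.
def pdf2swf_command(pdf, swf, poly2bitmap: PDF2SWF_ENABLE_POLY2BITMAP)
  cmd = ["pdf2swf"]
  # "-s poly2bitmap" asks pdf2swf to render vector graphics as bitmaps.
  cmd += ["-s", "poly2bitmap"] if poly2bitmap
  cmd + [pdf, "-o", swf]
end

puts pdf2swf_command("dst.pdf", "dst.swf").join(" ")
puts pdf2swf_command("dst.pdf", "dst.swf", poly2bitmap: true).join(" ")
```

Leaving the flag off by default matches the `false` value in the settings; it can be switched on for documents whose vector graphics do not convert cleanly.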
1. Download libreoffice 4.1.2 and swftools 0.9.2
wget http://download.documentfoundation.org/libreoffice/stable/4.1.2/rpm/x86_64/LibreOffice_4.1.2_Linux_x86-64_rpm.tar.gz
wget http://www.swftools.org/swftools-0.9.2.tar.gz
2. Install swftools
tar xzfv swftools-0.9.2.tar.gz
cd swftools-0.9.2
./configure
make
sudo make install
# verify the version
pdf2swf --version
# pdf2swf - part of swftools 0.9.2
3. Stop God. Replace openoffice in god script file with office.
/deploy/systasks/god.sh stop
vi /deploy/systasks/god.sh
# replace openoffice with office.
4. Remove openoffice and install libreoffice
# remove existing office installations
yum remove openoffice* libreoffice*
tar -xvf LibreOffice_4.1.2_Linux_x86-64_rpm.tar.gz
cd /tmp/LibreOffice_4.1.2_Linux_x86-64_rpm/RPMS/
yum localinstall *.rpm
5. Open staging.rb and add doc conversion related settings
vi /deploy/crossbow/shared/config/staging.rb
# Add the following lines
# Document conversion related settings
OFFICE_HOME = "/opt/libreoffice4.1"
JOD_CONVERTER_USE_REGISTRY_FILE = false
PDF2SWF_ENABLE_POLY2BITMAP = false
6. Change god related configuration.
vi /deploy/crossbow/shared/config/staging.god
# Increase office memory limit to 800.mb
# Change the following lines
# Line no - 3
OFFICE_PATH = "/opt/libreoffice4.1/program"
# Change openoffice to office in whole document.
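The manual version check in the swftools step could also be scripted. The sketch below parses the banner that `pdf2swf --version` prints according to the steps above. Both helper names are hypothetical, and the banner is passed in as a string so the example runs without swftools installed.

```ruby
require "rubygems" # provides Gem::Version for version comparison

# Hypothetical helper: extracts the swftools version from the banner printed
# by "pdf2swf --version", e.g. "pdf2swf - part of swftools 0.9.2".
def swftools_version(banner)
  match = banner.match(/swftools\s+(\d+(?:\.\d+)*)/)
  match && match[1]
end

# Hypothetical helper: true when the banner reports at least the given version.
def swftools_at_least?(banner, minimum)
  found = swftools_version(banner)
  return false unless found
  Gem::Version.new(found) >= Gem::Version.new(minimum)
end

banner = "pdf2swf - part of swftools 0.9.2" # in practice: `pdf2swf --version`
puts swftools_version(banner)
puts swftools_at_least?(banner, "0.9.2")
```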
1. Download libreoffice 4.1.2 and swftools 0.9.2
wget http://download.documentfoundation.org/libreoffice/stable/4.1.2/rpm/x86_64/LibreOffice_4.1.2_Linux_x86-64_rpm.tar.gz
wget http://www.swftools.org/swftools-0.9.2.tar.gz
2. Install swftools
tar xzfv swftools-0.9.2.tar.gz
cd swftools-0.9.2
./configure
make
sudo make install
# verify the version
pdf2swf --version
# pdf2swf - part of swftools 0.9.2
3. Stop God. Replace openoffice in god script file with office.
/deploy/systasks/god.sh stop
vi /deploy/systasks/god.sh
# replace openoffice with office.
4. Remove openoffice and install libreoffice
# remove existing office installations
yum remove openoffice* libreoffice*
tar -xvf LibreOffice_4.1.2_Linux_x86-64_rpm.tar.gz
cd /tmp/LibreOffice_4.1.2_Linux_x86-64_rpm/RPMS/
yum localinstall *.rpm
5. Open production.rb and add doc conversion related settings
vi /deploy/crossbow/shared/config/production.rb
# Add the following lines
# Document conversion related settings
OFFICE_HOME = "/opt/libreoffice4.1"
JOD_CONVERTER_USE_REGISTRY_FILE = false
PDF2SWF_ENABLE_POLY2BITMAP = false
6. Change god related configuration.
vi /deploy/crossbow/shared/config/production.god
# Change the following lines
# Increase office memory limit to 1000.mb
# Line no - 3
OFFICE_PATH = "/opt/libreoffice4.1/program"
# Change openoffice to office in whole document.
vi /deploy/crossbow/shared/generic_monitoring.god
# replace open office watch code with this.
God.watch do |w|
  script = "#{OFFICE_PATH}/soffice.bin --headless --accept=\"socket,host=127.0.0.1,port=8100;urp;\" --nofirststartwizard --norestore --nologo --nodefault"
  w.name = "office"
  w.group = "crossbow"
  w.interval = 60.seconds
  w.start = "#{script}"
  w.stop = "kill `cat #{RAILS_ROOT}/tmp/pids/soffice.pid`"
  w.start_grace = 20.seconds
  w.restart_grace = 20.seconds
  w.behavior(:clean_pid_file)
  generic_monitoring(w, :cpu_limit => PROCESS_SETTINGS[:office][:cpu], :memory_limit => PROCESS_SETTINGS[:office][:memory])
end
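After God has been restarted with the watch above, it can be useful to confirm that the office process really accepts connections on 127.0.0.1:8100 before sending conversions to it. The following is a minimal sketch using only Ruby's standard library. The helper name `office_listening?` and the two-second timeout are assumptions, not part of the deployed code.

```ruby
require "socket"

# Hypothetical helper: returns true when something accepts TCP connections on
# host:port, which is what JODConverter needs from the headless office process.
def office_listening?(host = "127.0.0.1", port = 8100, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Connection refused, timed out, or host not resolvable.
  false
end

reachable = office_listening?
puts(reachable ? "office is accepting connections" : "office is not reachable")
```

A successful connect only shows that the port is open; it does not prove that a conversion will succeed.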
1. Download openoffice 4.0.1 and swftools 0.9.2
wget http://jaist.dl.sourceforge.net/project/openofficeorg.mirror/4.0.1/binaries/en-US/Apache_OpenOffice_4.0.1_Linux_x86-64_install-rpm_en-US.tar.gz
wget http://www.swftools.org/swftools-0.9.2.tar.gz
2. Install swftools
tar xzfv swftools-0.9.2.tar.gz
cd swftools-0.9.2
./configure
make
sudo make install
# verify the version
pdf2swf --version
# pdf2swf - part of swftools 0.9.2
3. Stop God. Replace openoffice in god script file with office.
/deploy/systasks/god.sh stop
vi /deploy/systasks/god.sh
# replace openoffice with office.
4. Remove openoffice and libreoffice. Install openoffice 4.0.1
# remove existing office installations
yum remove openoffice* libreoffice*
tar -xvf Apache_OpenOffice_4.0.1_Linux_x86-64_install-rpm_en-US.tar.gz
cd en-US/RPMS/
yum localinstall *.rpm desktop-integration/openoffice4.0-redhat-*.rpm
# the desktop integration packages are not needed, but it is fine to have them.
5. Open production.rb and add doc conversion related settings
vi /deploy/crossbow/shared/config/production.rb
# Add the following lines
# Document conversion related settings
OFFICE_HOME = "/opt/openoffice4"
JOD_CONVERTER_USE_REGISTRY_FILE = false
PDF2SWF_ENABLE_POLY2BITMAP = false
6. Change god related configuration.
vi /deploy/crossbow/shared/config/production.god
# Change the following lines
# Increase office memory limit to 1000.mb
# Line no - 3
OFFICE_PATH = "/opt/openoffice4/program"
# Change openoffice to office in whole document.
vi /deploy/crossbow/shared/config/generic_monitoring.rb
# replace open office watch code with this.
God.watch do |w|
  script = "#{OFFICE_PATH}/soffice.bin -headless -accept=\"socket,host=127.0.0.1,port=8100;urp;\" -nofirststartwizard -norestore -nodefault"
  w.name = "office"
  w.group = "crossbow"
  w.interval = 60.seconds
  w.start = "#{script}"
  w.stop = "kill `cat #{RAILS_ROOT}/tmp/pids/soffice.pid`"
  w.start_grace = 20.seconds
  w.restart_grace = 20.seconds
  w.behavior(:clean_pid_file)
  generic_monitoring(w, :cpu_limit => PROCESS_SETTINGS[:office][:cpu], :memory_limit => PROCESS_SETTINGS[:office][:memory])
end
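The two watch definitions on this page differ only in the start command: the LibreOffice one uses double-dash options and adds `nologo`, while the OpenOffice one uses single-dash options. If both suites have to be supported, the command could be generated from one place. This is a minimal sketch; the helper name `soffice_command` and its keyword arguments are hypothetical.

```ruby
# Hypothetical helper: builds the soffice start command for either office suite.
# Option names are the ones used in the watch definitions on this page; only
# the dash prefix and the extra "nologo" option differ between the two.
def soffice_command(office_path, libreoffice:, host: "127.0.0.1", port: 8100)
  dash = libreoffice ? "--" : "-"
  options = ["headless",
             "accept=\"socket,host=#{host},port=#{port};urp;\"",
             "nofirststartwizard",
             "norestore"]
  options << "nologo" if libreoffice
  options << "nodefault"
  "#{office_path}/soffice.bin " + options.map { |option| dash + option }.join(" ")
end

puts soffice_command("/opt/libreoffice4.1/program", libreoffice: true)
puts soffice_command("/opt/openoffice4/program", libreoffice: false)
```

Inside a god configuration the result would be assigned to `w.start` in place of the hand-written `script` string.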