docx4j custom formatting support (writing OpenXML directly)

fair_jm

docx4j, docx generation, openxml

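This is a simple implementation.

I am working on a tool that converts pptx to docx.

My first attempt used POI, but images taken from the ppt (XSLFPictureShape) did not display once placed in the docx (the image files were there, as I could see after unzipping the docx). I do not know how to work with OpenXML directly for that, so I switched to docx4j.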

docx4j(http://www.docx4java.org/trac/docx4j)

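Images written with docx4j displayed without problems. Once the tables were handled, the next task was custom formatting: the formatting from the ppt has to carry over to Word, and at the very least the font and font size must be preserved.

Next I used dom4j to read the custom formatting information from the pptx (by this point POI handles the pptx parsing and docx4j handles the docx generation).

I then tried docx4j's own formatting API and found that it had no effect: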

http://blog.csdn.net/zhyh1986/article/details/8733389

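The first code sample in that post produced the same output whether or not I changed it.

In the end, writing the XML directly did the job.

First, a quick look at the basic OpenXML format (document.xml is the main file dealt with here):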

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document
	xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas"
	xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
	xmlns:o="urn:schemas-microsoft-com:office:office"
	xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
	xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
	xmlns:v="urn:schemas-microsoft-com:vml"
	xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing"
	xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
	xmlns:w10="urn:schemas-microsoft-com:office:word"
	xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
	xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml"
	xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup"
	xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk"
	xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml"
	xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape"
	mc:Ignorable="w14 wp14">
	<w:body>
		<w:p w:rsidR="003323F9" w:rsidRDefault="00713F7D" w:rsidP="00713F7D">
			<w:pPr>
				<w:jc w:val="center" />
				<w:rPr>
					<w:rFonts w:ascii="微软雅黑" w:eastAsia="微软雅黑" w:hAnsi="微软雅黑" />
					<w:b />
					<w:color w:val="000000" />
					<w:sz w:val="39" />
					<w:szCs w:val="39" />
				</w:rPr>
			</w:pPr>
			<w:r w:rsidRPr="00713F7D">
				<w:rPr>
					<w:rFonts w:ascii="微软雅黑" w:eastAsia="微软雅黑" w:hAnsi="微软雅黑" />
					<w:b />
					<w:color w:val="000000" />
					<w:sz w:val="39" />
					<w:szCs w:val="39" />
				</w:rPr>
				<w:t>Nutch相关框架</w:t>
			</w:r>
			<w:r w:rsidRPr="00713F7D">
				<w:rPr>
					<w:rFonts w:ascii="微软雅黑" w:eastAsia="微软雅黑" w:hAnsi="微软雅黑"
						w:hint="eastAsia" />
					<w:b />
					<w:color w:val="000000" />
					<w:sz w:val="39" />
					<w:szCs w:val="39" />
				</w:rPr>
				<w:t>视频教程</w:t>
			</w:r>
		</w:p>

... ...

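The format is fairly simple and clear.

Each w:r is wrapped in a w:p. A w:p holds several w:r elements, which together make up one paragraph.

Inside a w:r, w:t is the content and w:rPr holds the formatting parameters.

These elements alone are enough to meet the minimum requirement.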
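To make the w:p / w:r / w:rPr / w:t structure concrete, here is a minimal sketch that assembles one paragraph with a single formatted run as a string. The class name `WordRunSketch` and its methods are hypothetical helpers made up for illustration, not part of docx4j or POI; only JDK classes are used.

```java
/** Hypothetical helper: builds one <w:p> that contains a single formatted <w:r>. */
final class WordRunSketch {

    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Escapes characters that are unsafe in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /**
     * @param text       content of the run (goes into w:t)
     * @param font       font name written to w:rFonts
     * @param halfPoints value for w:sz, in half-points (39 means 19.5pt)
     * @param bold       whether to emit w:b
     */
    static String paragraph(String text, String font, int halfPoints, boolean bold) {
        String f = escape(font);
        StringBuilder sb = new StringBuilder();
        // The fragment stands alone, so it declares the w prefix itself
        sb.append("<w:p xmlns:w=\"").append(W_NS).append("\">");
        sb.append("<w:r><w:rPr>");
        sb.append("<w:rFonts w:ascii=\"").append(f)
          .append("\" w:eastAsia=\"").append(f)
          .append("\" w:hAnsi=\"").append(f).append("\"/>");
        if (bold) {
            sb.append("<w:b/>");
        }
        sb.append("<w:sz w:val=\"").append(halfPoints).append("\"/>");
        sb.append("<w:szCs w:val=\"").append(halfPoints).append("\"/>");
        sb.append("</w:rPr>");
        sb.append("<w:t>").append(escape(text)).append("</w:t>");
        sb.append("</w:r></w:p>");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(paragraph("Nutch相关框架", "微软雅黑", 39, true));
    }
}
```

The resulting string has the same shape as one run of the document.xml excerpt above, minus the rsid attributes and the colour.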

To write the XML directly through docx4j, you can start from w:p; just remember to declare the namespace for the w prefix.
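The namespace declaration matters because the fragment is parsed on its own, outside document.xml, so the w prefix has nothing to bind to unless the fragment declares it. The sketch below shows this with the JDK's namespace-aware DOM parser as a stand-in. It does not call docx4j, and `NamespaceCheck` is a hypothetical name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: checks whether an XML fragment parses with namespaces enabled. */
final class NamespaceCheck {

    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // An unbound prefix such as "w" ends up here
            return false;
        }
    }

    public static void main(String[] args) {
        String withNs = "<w:p xmlns:w=\"http://schemas.openxmlformats.org/"
                + "wordprocessingml/2006/main\"><w:r><w:t>hi</w:t></w:r></w:p>";
        String withoutNs = "<w:p><w:r><w:t>hi</w:t></w:r></w:p>";
        System.out.println(parses(withNs));
        System.out.println(parses(withoutNs));
    }
}
```

Only the first fragment is accepted; the second is rejected because the prefix w is not bound.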

Here is the complete processing code:

				// Imports needed: java.io.ByteArrayInputStream, java.util.List, java.util.Objects,
				// org.apache.poi.xslf.usermodel.XSLFTextShape, org.dom4j.Element, org.dom4j.io.SAXReader.
				// This fragment runs inside the loop over the slide's shapes.
				if (shape instanceof XSLFTextShape) {
					XSLFTextShape txShape = (XSLFTextShape) shape;

					// Parse the shape's own XML with dom4j
					SAXReader reader = new SAXReader();
					org.dom4j.Document document = reader
							.read(new ByteArrayInputStream(
									("<?xml version=\"1.0\" encoding=\"utf-8\"?>" + txShape
											.getXmlObject().toString())
											.getBytes("utf-8")));
					StringBuilder sb = new StringBuilder();
					// Root element
					Element root = document.getRootElement();
					try {
						// Text body; only its first paragraph is converted
						Element txBody = root.element("txBody");
						Element p = txBody.element("p");
						List<Element> rs = p.elements("r");

						// Nothing to write if the paragraph has no runs
						if (rs.isEmpty()) {
							continue;
						}

						sb.append("<w:p xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">");
						/*
						 * Target shape of each run:
						 * <w:r> <w:rPr> <w:rFonts w:ascii="微软雅黑"
						 * w:eastAsia="微软雅黑" w:hAnsi="微软雅黑" w:hint="eastAsia" />
						 * <w:b /> <w:color w:val="000000" /> <w:sz w:val="39"
						 * /> <w:szCs w:val="39" /> </w:rPr>
						 * <w:t>通过ivy来进行依赖管理（1.2之后）。</w:t> </w:r>
						 */
						for (Element r : rs) {
							sb.append("<w:r>");
							sb.append("<w:rPr>");
							// rPr is optional on a run, so guard against null
							Element erPr = r.element("rPr");
							if (erPr != null) {
								Element ea = erPr.element("ea");
								String typeface = (ea == null) ? null : ea.attributeValue("typeface");
								if (typeface != null) {
									// Write the typeface name (not the element object), escaped for an attribute
									String font = typeface.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
									sb.append("<w:rFonts w:ascii=\"" + font + "\" w:eastAsia=\"" + font
											+ "\" w:hAnsi=\"" + font + "\" w:hint=\"eastAsia\" />");
								}
								if (Objects.equals("1", erPr.attributeValue("b"))) {
									sb.append("<w:b />");
								}
								if (Objects.equals("1", erPr.attributeValue("i"))) {
									sb.append("<w:i />");
								}
								String size = erPr.attributeValue("sz");
								if (size != null) {
									// pptx sz is in hundredths of a point, w:sz is in half-points
									int halfPoints = Integer.parseInt(size) / 50;
									sb.append("<w:sz w:val=\"" + halfPoints + "\" /> <w:szCs w:val=\"" + halfPoints + "\" />");
								}
							}
							sb.append("</w:rPr>");

							// Escape the text so that &, < and > cannot break the XML
							Element t = r.element("t");
							String text = (t == null) ? "" : t.getText()
									.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
							// xml:space="preserve" keeps spaces at run boundaries
							sb.append("<w:t xml:space=\"preserve\">" + text + "</w:t>");
							sb.append("</w:r>");
						}
						sb.append("</w:p>");
						// addParagraph takes the paragraph as an XML string
						wordMLPackage.getMainDocumentPart()
						.addParagraph(sb.toString());
					} catch (Exception e) {
						// Fall back to plain text if the formatted paragraph cannot be built
						e.printStackTrace();
						wordMLPackage.getMainDocumentPart().addParagraphOfText(
								txShape.getText());
					}

					System.out.println(txShape.getText());
}
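
A note on units when carrying the font size over: the sz attribute in the pptx XML is expressed in hundredths of a point, while w:sz and w:szCs in document.xml are in half-points. The small sketch below shows the conversion. `FontSizeUnits` is a hypothetical name, and rounding to the nearest half-point is my own choice.

```java
/** Hypothetical helper: converts a DrawingML font size to a WordprocessingML one. */
final class FontSizeUnits {

    /**
     * @param hundredthsOfPoint value of the pptx sz attribute (1800 means 18pt)
     * @return value for w:sz in half-points (36 means 18pt)
     */
    static int toHalfPoints(int hundredthsOfPoint) {
        // 1 half-point = 50 hundredths of a point; round to the nearest half-point
        return Math.round(hundredthsOfPoint / 50f);
    }

    public static void main(String[] args) {
        System.out.println(toHalfPoints(1800));
        System.out.println(toHalfPoints(1950));
    }
}
```

For example, a pptx run with sz="1950" corresponds to w:sz w:val="39", the value used in the document.xml excerpt earlier.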

2013-08-01 20:53