
html -> txt conversion and encoding



Hi,

I tested dumping an HTML page in CP1251 encoding to text with lynx:
  lynx  -dump -display_charset=windows-1251 \
	-assume_charset=windows-1251 blah.html
It works perfectly. The key option is -display_charset=windows-1251;
-assume_charset=windows-1251 is useful when the HTML does not declare
a charset.
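
For anyone who wants to try the same options outside the doc build, here
is a minimal sketch that only assembles the lynx command line used above
and prints it. The names CHARSET and html2txt_cmd are made up for this
illustration and do not exist in share/mk; the lynx options are the ones
from the command above.

```shell
#!/bin/sh
# Sketch only: CHARSET and html2txt_cmd are hypothetical helper names,
# not part of the FreeBSD doc build infrastructure.
CHARSET=windows-1251

html2txt_cmd() {
    # $1: path to the HTML file to convert.
    # -display_charset sets the charset of the text output;
    # -assume_charset is the fallback when the HTML declares none.
    printf '%s\n' "lynx -dump -display_charset=${CHARSET} -assume_charset=${CHARSET} $1"
}

# Print the command line for inspection instead of running it.
html2txt_cmd blah.html
```

With lynx installed, the printed line can be run with the output
redirected to a file, e.g. by appending "> blah.txt".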
I changed share/mk/doc.docbook.mk and share/mk/doc.project.mk,
tried make FORMATS="html txt", and the result looks good. A diff of
the changed files in share/ is attached to this mail.

Comments? :)

Regards,
Viktor
-- 
Linux is for those who hate Windows.
FreeBSD is for those who love UNIX.
diff -urb work/doc/share/mk/doc.docbook.mk test/doc/share/mk/doc.docbook.mk
--- work/doc/share/mk/doc.docbook.mk    Wed Apr 21 04:43:16 2004
+++ test/doc/share/mk/doc.docbook.mk    Wed Apr 21 05:57:36 2004
@@ -237,8 +237,7 @@
 GROFF?=                groff
 TIDY?=         ${PREFIX}/bin/tidy
 TIDYOPTS?=     -wrap 90 -m -raw -preserve -f /dev/null -asxml ${TIDYFLAGS}
-HTML2TXT?=     ${PREFIX}/bin/lynx
-HTML2TXTFLAGS?= -display_charset=windows-1251 -assume_charset=windows-1251
+HTML2TXT?=     ${PREFIX}/bin/links
 HTML2TXTOPTS?= -dump ${HTML2TXTFLAGS}
 HTML2PDB?=     ${PREFIX}/bin/iSiloBSD
 HTML2PDBOPTS?= -y -d0 -Idef ${HTML2PDBFLAGS}
diff -urb work/doc/share/mk/doc.project.mk test/doc/share/mk/doc.project.mk
--- work/doc/share/mk/doc.project.mk    Wed Apr 21 04:45:52 2004
+++ test/doc/share/mk/doc.project.mk    Wed Apr 21 05:57:36 2004
@@ -78,7 +78,7 @@
 MKDIR?=                /bin/mkdir
 RM?=           /bin/rm
 MV?=           /bin/mv
-HTML2TXT?=     ${PREFIX}/bin/lynx
+HTML2TXT?=     ${PREFIX}/bin/links
 HTML2TXTOPTS?= -dump ${HTML2TXTFLAGS}
 ISPELL?=       ispell
 ISPELLOPTS?=   -l -p /usr/share/dict/freebsd ${ISPELLFLAGS}