So based on comment 3, I think the way to fix this is to get rid of the old CJK parallel state machine detectors in intl/chardet and just use the universal detector with a language filter. The universal detector has been much better maintained, and dropping the old detectors will remove a lot of duplicated data.
The patch is a bit large and scary, but most of it is just moving XPCOM stuff around. Since time is short, I'm requesting code review now while I work on testcases.
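To make the "universal detector with a language filter" idea concrete, here is a rough sketch of how a filter bitmask could select which charset probers a single detector enables. It is not taken from the patch: the enum values, the ProberInfo table, the charset-to-language mapping, and EnabledCharsets() are all hypothetical names made up for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical language-filter bits (illustrative only, not the patch's names).
enum LangFilter : std::uint32_t {
  kFilterChineseSimplified  = 1u << 0,
  kFilterChineseTraditional = 1u << 1,
  kFilterJapanese           = 1u << 2,
  kFilterKorean             = 1u << 3,
  kFilterNonCJK             = 1u << 4,
  kFilterCJK = kFilterChineseSimplified | kFilterChineseTraditional |
               kFilterJapanese | kFilterKorean,
  kFilterAll = kFilterCJK | kFilterNonCJK
};

// One entry per prober: the charset it recognizes and the language
// filters under which it should run. The mapping is an assumption.
struct ProberInfo {
  const char* charset;
  std::uint32_t languages;
};

// Returns the charsets a detector would probe for under the given filter.
// A single table replaces one hand-built detector per language.
std::vector<std::string> EnabledCharsets(std::uint32_t filter) {
  static const ProberInfo kProbers[] = {
    {"UTF-8",        kFilterAll},
    {"Shift_JIS",    kFilterJapanese},
    {"EUC-JP",       kFilterJapanese},
    {"ISO-2022-JP",  kFilterJapanese},
    {"EUC-KR",       kFilterKorean},
    {"ISO-2022-KR",  kFilterKorean},
    {"GB18030",      kFilterChineseSimplified},
    {"HZ-GB-2312",   kFilterChineseSimplified},
    {"Big5",         kFilterChineseTraditional},
    {"EUC-TW",       kFilterChineseTraditional},
    {"windows-1252", kFilterNonCJK},
  };
  std::vector<std::string> enabled;
  for (const ProberInfo& prober : kProbers) {
    if (prober.languages & filter) {
      enabled.push_back(prober.charset);
    }
  }
  return enabled;
}
```

Under this sketch, a "Japanese" detector is just the universal detector constructed with kFilterJapanese, so the per-language verifier tables would no longer need to exist separately.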
Created attachment 313966
Patch
The patch doesn't include the cvs removes:
intl/chardet/src/Big5Statistics.h
intl/chardet/src/EUCJPStatistics.h
intl/chardet/src/EUCKRStatistics.h
intl/chardet/src/EUCTWStatistics.h
intl/chardet/src/GB2312Statistics.h
intl/chardet/src/nsBIG5Verifier.h
intl/chardet/src/nsCP1252Verifier.h
intl/chardet/src/nsEUCJPVerifier.h
intl/chardet/src/nsEUCKRVerifier.h
intl/chardet/src/nsEUCTWVerifier.h
intl/chardet/src/nsGB18030Verifier.h
intl/chardet/src/nsGB2312Verifier.h
intl/chardet/src/nsHZVerifier.h
intl/chardet/src/nsISO2022CNVerifier.h
intl/chardet/src/nsISO2022JPVerifier.h
intl/chardet/src/nsISO2022KRVerifier.h
intl/chardet/src/nsPSMDetectors.cpp
intl/chardet/src/nsPSMDetectors.h
intl/chardet/src/nsPkgInt.h
intl/chardet/src/nsSJISVerifier.h
intl/chardet/src/nsUCS2BEVerifier.h
intl/chardet/src/nsUCS2LEVerifier.h
intl/chardet/src/nsUTF8Verifier.h
intl/chardet/src/nsVerifier.h