SAP provides the CCC converter, a kernel program that converts characters from one encoding to another.

Character encoding (aka code page)

A character encoding consists of a name ("utf-8", "iso-8859-1", etc.) and an equivalence table that maps each character of a set to the octet values that represent it.

Code page is the term that SAP uses for character encoding. SAP identifies code pages by a 4-digit number instead of a name.
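
As a small illustration of such an equivalence table, the sketch below uses Python's built-in codecs (not the SAP converter) to show which octets represent the same character in several encodings. The helper name octets_of is hypothetical and exists only for this example.

```python
def octets_of(text: str, encoding: str) -> str:
    """Return the octets that represent `text` in `encoding`, as spaced hex."""
    return text.encode(encoding).hex(" ")


if __name__ == "__main__":
    # The same character maps to different octet values in different encodings.
    for name in ("iso-8859-1", "windows-1252", "utf-8", "utf-16le", "utf-16be"):
        print(f"{name:<12} {octets_of('é', name)}")
```

The character is the same in every row; only the octet values assigned by each encoding differ.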

Equivalences between Character encoding international name and SAP code page number

Some SAP programs expect:

either a 4-character code: you then have to enter the SAP code page number
- You can find the SAP code page number for an international character encoding name by calling the SCP_CODEPAGE_BY_EXTERNAL_NAME function module, or by looking in the TCP00A database table.
or a 20-character code: you can usually enter either the character encoding name or the SAP code page number. The case of the character encoding name is usually ignored.

Examples of a few equivalences:

SAP code page	Character encoding international name
124	IBM EBCDIC 00697/00297
1100	iso-8859-1
1105	US-ASCII (7 bits)
1160	windows-1252
4102	utf-16be
4103	utf-16le
4110	utf-8
8000	Shift-JIS
8300	BIG5
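
The lookup from an encoding name to a code page number can be pictured with the following Python sketch. It is an illustration only: the dictionary copies part of the table above, the function name sap_code_page_for is hypothetical, and name normalization relies on Python's codec registry rather than on anything SAP provides. In an SAP system the authoritative source remains the TCP00A table or the function module mentioned earlier.

```python
import codecs

# Subset of the equivalences listed above: encoding name -> SAP code page.
# (The EBCDIC entry is left out because this sketch resolves names through
# Python's codec registry.)
SAP_CODE_PAGES = {
    "iso-8859-1": "1100",
    "us-ascii": "1105",
    "windows-1252": "1160",
    "utf-16be": "4102",
    "utf-16le": "4103",
    "utf-8": "4110",
    "shift-jis": "8000",
    "big5": "8300",
}

# Normalize the keys once so that aliases and letter case do not matter.
_BY_CODEC = {codecs.lookup(name).name: page for name, page in SAP_CODE_PAGES.items()}


def sap_code_page_for(encoding_name: str) -> str:
    """Return the code page for an encoding name; raise LookupError if unknown."""
    return _BY_CODEC[codecs.lookup(encoding_name).name]
```

Because both sides are normalized, "UTF-8", "utf-8" and "utf8" all resolve to the same entry, which mirrors the case-insensitive behaviour described for the 20-character fields.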

Usual problems with Character encoding conversion

Converting from one code page to another may not be possible for every character of the source code page, because some characters do not exist in the target code page.
- For example, converting from big5 (Chinese) to us-ascii makes no sense. If you think that it should be possible, then you probably didn't choose the right code page.
- In that case, you have to provide a replacement character to the CCC converter.
A sequence of bytes is not recognized as a character in the source code page. It means that:
- either the sending program does not respect the code page (then ask for the sending program to be corrected)
- or you should choose another code page (sometimes, the differences between code pages are very small)
- or your program has erroneously truncated the input bytes, so that the last input byte(s) mean nothing.
- For example, the two bytes D8 00 alone mean nothing in utf-16be: two more bytes are expected in order to identify the character (here encoded on 4 bytes).
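
Both problems can be reproduced outside SAP. The Python sketch below is an analogy built on Python's own codecs, not the CCC converter; the function names convert_text and is_valid are hypothetical. The first function substitutes a replacement character for anything missing from the target encoding, and the second reports whether a byte sequence is valid in the source encoding. The truncation case uses a lone high surrogate, that is, the bytes D8 00 read in big-endian order.

```python
def convert_text(text: str, target: str, replacement: str = "") -> bytes:
    """Encode `text` into `target`, substituting `replacement` where needed.

    With an empty replacement the conversion is strict and raises
    UnicodeEncodeError on the first character missing from the target.
    Encoding character by character is an assumption that only suits
    stateless encodings without a byte order mark (ascii, iso-8859-1, ...).
    """
    if not replacement:
        return text.encode(target)
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(target)
        except UnicodeEncodeError:
            out += replacement.encode(target)
    return bytes(out)


def is_valid(data: bytes, source: str) -> bool:
    """Return True if every byte of `data` belongs to a character in `source`."""
    try:
        data.decode(source)
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    chinese = bytes("中文 ok", "big5").decode("big5")
    print(convert_text(chinese, "ascii", "#"))            # replacement character used
    print(is_valid(bytes.fromhex("d800dc00"), "utf-16be"))  # complete surrogate pair
    print(is_valid(bytes.fromhex("d800"), "utf-16be"))      # truncated after 2 bytes
```

A strict conversion that fails is usually the more useful default, since it signals that the chosen code page is probably wrong; a replacement character only hides the mismatch.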

How to call the CCC converter

The CCC converter is a kernel program that can be accessed in several ways:

CL_ABAP_CODEPAGE class, available since 7.02. The code page cannot be given as the SAP number; it must be either the international character encoding name or the name used in the Java language.
CL_ABAP_CONV_* classes, available since 6.10, where CL_ABAP_CONV_OBJ is the master class which gives full access to the CCC converter. The following classes call the CCC converter with default values:
- CL_ABAP_CONV_IN_CE: converts bytes representing characters in a given codepage into a character or string variable
- CL_ABAP_CONV_OUT_CE: converts a character or string variable into bytes representing characters in a given codepage
- CL_ABAP_CONV_X2X_CE: converts bytes representing characters in a given codepage, into bytes representing characters in another given codepage
SCP_TRANSLATE_CHARS function module, which works with all releases
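
The three conversion directions offered by the CL_ABAP_CONV_IN_CE, CL_ABAP_CONV_OUT_CE and CL_ABAP_CONV_X2X_CE classes can be pictured with the Python analogy below. This is not ABAP and does not reproduce the signatures of those classes; the function names conv_in, conv_out and conv_x2x are hypothetical and only mirror the direction of each conversion.

```python
def conv_in(data: bytes, source: str) -> str:
    """Bytes in a given code page -> character string."""
    return data.decode(source)


def conv_out(text: str, target: str) -> bytes:
    """Character string -> bytes in a given code page."""
    return text.encode(target)


def conv_x2x(data: bytes, source: str, target: str) -> bytes:
    """Bytes in one code page -> bytes in another code page."""
    return conv_out(conv_in(data, source), target)


if __name__ == "__main__":
    latin1_bytes = conv_out("é", "iso-8859-1")
    print(conv_x2x(latin1_bytes, "iso-8859-1", "utf-8").hex(" "))
```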

Note: CCC stands for Character set Conversion Cache, a memory area where SAP stores the code pages it needs for conversions.

Links

SDN blog - BSP - a Developer's Journal: Part VII - Dealing with multiple languages (English, German, Spanish, Thai, and Polish), by Thomas Jung
What is Unicode
Unicode Transformation Format
SAP library:
- Internationalization
- Character codes: short explanation of character encoding
- Data conversion: short explanation of conversion possibilities in ABAP
