Python: Generating a PDF containing emojis using reportlab
Emojis 😃 🎉 👍 💕 have become hugely popular. Anyone who has used WhatsApp or Instagram knows how cool emojis are 😎, and they are everywhere now. Emoji originated in Japan: the word “emoji” means “picture (e) character (moji)” in Japanese.
Technically, emojis are a subset of the Unicode character set. In short, Unicode aims to assign a number (a code point) to every character of every writing system, so it includes symbols from practically every language in use.
There are several encodings that turn Unicode code points into bytes, such as UTF-8, UTF-16 and UTF-32. The most popular is UTF-8. To understand how Unicode text is interpreted, stored and retrieved, here is a great article: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
A basic emoji such as 😃 is a single Unicode character that takes up to 4 bytes, depending on the encoding used (some emojis, such as flags, are sequences of several characters). So the old assumption that every character is just one byte no longer holds.
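To make the byte counts concrete, here is a small sketch that measures how many bytes a few characters need in three encodings. The helper name byte_lengths is ours, just for illustration.

```python
def byte_lengths(ch):
    """Return how many bytes one character needs in three Unicode encodings."""
    # The "-le" variants are used so that no byte order mark is counted.
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


# A, e with acute accent, euro sign, smiley emoji
for ch in ("A", "\u00e9", "\u20ac", "\U0001f603"):
    # ascii() keeps the output printable on terminals without emoji support
    print(ascii(ch), byte_lengths(ch))

print("\U0001f603".encode("utf-8"))  # b'\xf0\x9f\x98\x83'
```

The letter A fits in one UTF-8 byte, while 😃 needs four bytes in UTF-8, UTF-16 and UTF-32 alike.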
The letter ‘A’ corresponds to ‘65’ in decimal and ‘0x41’ in hex. Similarly, the emoji 😃 corresponds to ‘128515’ in decimal and ‘0x1F603’ in hexadecimal. Though every emoji (like any Unicode character) can be written in decimal or hexadecimal, the standard way to denote it is by its ‘code point’.
The code point for 😃 is U+1F603. A code point is written as “U+” (to denote Unicode) followed by the hex value “1F603”. In Python it is written as "\U0001f603" (u"\U0001f603" in Python 2).
Here is a list of emojis with their code points.
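To tie the decimal, hexadecimal and code point notations together, here is a short Python 3 sketch. The helper code_point is a name we made up for this example; ord, hex and chr are built-ins.

```python
smiley = "\U0001f603"  # the same character as the literal 😃


def code_point(ch):
    """Format a single character in the standard U+XXXX notation."""
    return "U+{:04X}".format(ord(ch))


print(ord("A"), hex(ord("A")), code_point("A"))           # 65 0x41 U+0041
print(ord(smiley), hex(ord(smiley)), code_point(smiley))  # 128515 0x1f603 U+1F603
print(chr(0x1F603) == smiley)                             # True
print(len(smiley))                                        # 1
```

In Python 3 the emoji counts as one character even though it needs several bytes once encoded.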
Every emoji has a name. Some apps like GitHub, Slack, Basecamp, etc. provide a short name for each emoji to make it easy to remember. You can find the list of emojis with their short names here: emoji-cheat-sheet
Interpreting with Python
Get the name and category of an emoji using Python’s unicodedata module:
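The original listing did not survive here, so this is a minimal sketch of such a lookup using the standard library's unicodedata.name and unicodedata.category. The wrapper describe is our own name.

```python
import unicodedata


def describe(ch):
    """Return the (name, category) pair of a single Unicode character."""
    # The second argument to name() is a fallback for unnamed characters.
    return unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch)


# Smiley, party popper and thumbs up, written as escapes
for ch in "\U0001f603\U0001f389\U0001f44d":
    # ascii() keeps the output printable on terminals without emoji support
    print(ascii(ch), *describe(ch))
```

For 😃 this gives the name SMILING FACE WITH OPEN MOUTH and the category So (Symbol, other).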
For more info on
Emojis are becoming popular, and you have probably seen them in many web and mobile apps. Rendering these emojis inside a browser or an app is another problem.
If you are able to see all the colorful emojis I have used in this blog, then your browser supports emojis well. Recently at our company we happened to work on some features related to printing text containing emojis to PDF.
Wait... but how do we render these emojis in a PDF? The problem boils down to the font you are using.
Printing emojis to PDF
We are going to use the reportlab Python library to generate the PDF.
Note: All the code samples used here were tested with Python 3.4.
Here is a simple example that writes text containing emojis to a PDF:
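The original listing is missing here, so below is a minimal sketch of what such a program could look like, assuming reportlab is installed. It uses the canvas API (Canvas, setFont, drawString, save); the function name write_pdf, the sample text and the coordinates are our own choices, not necessarily what the original code used.

```python
# Sample text with three emojis written as escapes (smiley, party popper, thumbs up)
TEXT = "Hello emojis \U0001f603 \U0001f389 \U0001f44d"


def write_pdf(path, text, font_name="Helvetica-Bold", font_size=18):
    """Draw one line of text near the top of a single-page PDF."""
    # Imported lazily so the sketch can be loaded without reportlab installed.
    from reportlab.lib.pagesizes import A4
    from reportlab.pdfgen import canvas

    pdf = canvas.Canvas(path, pagesize=A4)
    pdf.setFont(font_name, font_size)
    # 72 points = 1 inch from the left and top edges of the page
    pdf.drawString(72, A4[1] - 72, text)
    pdf.save()
```

Calling write_pdf("emoji_default_font.pdf", TEXT) should produce a one-page PDF with the text set in Helvetica-Bold.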
Here is the preview of how it looks
What!! This is obviously not what we wanted. Emojis that the font does not support are shown as a “black box” by the default font (Helvetica-Bold).
Let’s try a custom font that supports emoji symbols.
Using custom font
The Symbola font supports most of the emoji symbols.
Here is a simple program that writes emoji content to a PDF using a custom font (Symbola):
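As the original listing is not available, here is a hedged sketch of the custom font approach. It assumes reportlab is installed and that a Symbola TrueType file sits next to the script as Symbola.ttf (that path is an assumption). pdfmetrics.registerFont and TTFont are reportlab's calls for registering a TrueType font; the function name is ours.

```python
def write_pdf_with_symbola(path, text, font_path="Symbola.ttf", font_size=18):
    """Draw one line of text using the Symbola TrueType font."""
    # Imported lazily so the sketch can be loaded without reportlab installed.
    from reportlab.pdfbase import pdfmetrics
    from reportlab.pdfbase.ttfonts import TTFont
    from reportlab.pdfgen import canvas

    # Register the font file under the name "Symbola" so setFont can find it.
    pdfmetrics.registerFont(TTFont("Symbola", font_path))

    pdf = canvas.Canvas(path)
    pdf.setFont("Symbola", font_size)
    pdf.drawString(72, 720, text)
    pdf.save()
```

Calling write_pdf_with_symbola("emoji_symbola.pdf", "Hello emojis \U0001f603") should draw the emoji as a monochrome glyph instead of a black box, provided the font file contains that glyph.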
Here is the preview of how it looks with Symbola font.
Cool. Symbola prints these emojis as promised. But wait! Can’t we get colorful emojis like the ones we see in the browser? Yes, and here comes Emojione.
Intro to Emojione
Emojione provides a wide-ranging collection of emojis (almost every emoji supported by Instagram, Facebook, Slack, etc.).
The collection includes PNG and SVG image files for every emoji they support. Emojione provides the following cool features.
Any emoji can be converted from its
- Unicode to corresponding PNG or SVG image
- Short name to corresponding PNG or SVG image
- Unicode to short name (and vice versa)
- ASCII symbol to corresponding PNG or SVG image (say, convert :) to 😄)
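To give a feel for the short name conversions listed above, here is a tiny sketch with a hand-written three-entry table. This is not Emojione's code or data; the table, the regular expression and the function names are ours for illustration, and a real table would cover every supported emoji.

```python
import re

# Illustrative sample only: three short names from the emoji cheat sheet.
SHORTNAME_TO_UNICODE = {
    ":smiley:": "\U0001f603",
    ":tada:": "\U0001f389",
    ":thumbsup:": "\U0001f44d",
}
UNICODE_TO_SHORTNAME = {char: name for name, char in SHORTNAME_TO_UNICODE.items()}

# A short name is a run of lowercase letters, digits, "_", "+" or "-" between colons.
SHORTNAME_RE = re.compile(r":[a-z0-9_+-]+:")


def shortname_to_unicode(text):
    """Replace known short names with their emoji; leave unknown ones alone."""
    return SHORTNAME_RE.sub(
        lambda m: SHORTNAME_TO_UNICODE.get(m.group(0), m.group(0)), text
    )


def unicode_to_shortname(text):
    """Replace known emoji characters with their short names."""
    return "".join(UNICODE_TO_SHORTNAME.get(ch, ch) for ch in text)
```

With this table, shortname_to_unicode("Nice :thumbsup:") gives "Nice 👍", and unicode_to_shortname reverses it.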
But unfortunately they don’t provide any library for Python. So we have written one ourselves 👍 and we call it Emojipy.
Inserting emoji using Emojipy and reportlab
Usually an app (web or mobile) renders emojis by replacing each Unicode character with the corresponding PNG or SVG image (that’s why you can see all the colorful emojis in this blog).
Let’s do the same thing with the PDF.
Below is a simple Python program that uses the emojipy library to render emojis in a PDF:
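The emojipy listing itself is not reproduced here, so as a stand-in here is a minimal sketch of the same replace-with-image idea that does not call emojipy at all. It assumes the emoji PNG files live in a local folder and are named after the lowercase hex code point (for example emoji_png/1f603.png); the folder name, the regular expression and both function names are our assumptions. The inline img tag and the Paragraph, wrapOn and drawOn calls are reportlab's.

```python
import re
from xml.sax.saxutils import escape

# Rough ranges covering the emojis used in this post (an assumption,
# not a complete emoji pattern).
EMOJI_RE = re.compile("[\U0001f300-\U0001faff\u2600-\u27bf]")


def emoji_to_img_markup(text, image_dir="emoji_png", size=16):
    """Replace each emoji with an inline <img/> tag for a reportlab Paragraph."""

    def to_img(match):
        code = format(ord(match.group(0)), "x")  # e.g. "1f603"
        return '<img src="{}/{}.png" width="{}" height="{}" valign="middle"/>'.format(
            image_dir, code, size, size
        )

    # Escape &, < and > first, because Paragraph parses its text as markup.
    return EMOJI_RE.sub(to_img, escape(text))


def write_pdf_with_images(path, text, image_dir="emoji_png"):
    """Render text as a Paragraph, with emojis drawn from PNG files."""
    # Imported lazily so the sketch can be loaded without reportlab installed.
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.pdfgen import canvas
    from reportlab.platypus import Paragraph

    style = getSampleStyleSheet()["Title"]
    paragraph = Paragraph(emoji_to_img_markup(text, image_dir), style)
    pdf = canvas.Canvas(path)
    paragraph.wrapOn(pdf, 450, 100)  # available width and height in points
    paragraph.drawOn(pdf, 72, 700)
    pdf.save()
```

The markup step is kept separate from the drawing step, so the string handed to reportlab can be inspected on its own before any PDF is written.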
Here is the preview how it looks with
Wooo!! This is what we wanted!!