Python String Format, String Format Specifiers, Escape Sequences, Raw Format
Python String Format:
Python supports multiple ways to format text strings, including %-formatting and str.format(). Each of these methods has its advantages, but each also has disadvantages that make it cumbersome to use in practice. A newer string formatting mechanism, referred to as "f-strings", has become popular in the Python community. The name is taken from the leading character used to denote such strings and stands for "formatted strings". F-strings provide a way to embed expressions inside string literals using a minimal syntax. A string literal is what we see in the source code of a Python program, including the quotation marks. Note that an f-string is really an expression evaluated at run time, not a constant value. In Python source code, an f-string is a literal string prefixed with 'f' that contains expressions within curly braces '{' and '}'.
F-string formatting is driven by the desire for a simpler way to format strings in Python. The existing formatting methods are error-prone, inflexible, or cumbersome. %-formatting is limited in the types it supports: only ints, strs and doubles can be formatted. All other types are either not accepted or converted to one of these types before formatting. In addition, there is a well-known trap when a single value is passed. For example,
>>> almanac = 'nostradamus'
>>> 'seer: %s' % almanac
'seer: nostradamus'
This works well when a single value is passed. But if the variable almanac were ever to be a tuple, the same code would fail. For example,
>>> almanac = ('nostradamus', 1567)
>>> 'seer: %s' % almanac
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting
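One common way around this trap is to wrap the value in a one-element tuple, so that %-formatting always sees exactly one argument. The sketch below is only an illustration; the helper name describe_seer is hypothetical and not part of any library.

```python
def describe_seer(almanac):
    # Wrapping the value in a one-element tuple means a tuple argument
    # is formatted as a single value instead of being unpacked.
    return 'seer: %s' % (almanac,)


single = describe_seer('nostradamus')
paired = describe_seer(('nostradamus', 1567))

print(single)
print(paired)
```

The same call now works whether almanac is a plain string or a tuple, at the cost of slightly noisier syntax.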
In other words, %-formatting treats a tuple as a sequence of separate arguments rather than as one value, so a single %s placeholder cannot consume it. The str.format() method was added to address some of these problems with %-formatting. In particular, it uses standard function call syntax and therefore supports multiple parameters. However, str.format() is not without its issues. Chief among them is its verbosity. For example, in the following code the text value is repeated.
>>> value = 4 * 20
>>> 'The value is {value}.'.format(value=value)
'The value is 80.'
This is more verbose than it needs to be. Even in its simplest form, there is a bit of boilerplate, and the value that is inserted into the placeholder is sometimes far removed from where the placeholder is placed.
>>> 'The value is {}.'.format(value)
'The value is 80.'
This statement is not very informative, because the placeholder does not say which value it receives. With an f-string, this becomes:
>>> f'The value is {value}.'
'The value is 80.'
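To compare the three approaches side by side, the following minimal sketch builds the same sentence with each of them. The variable names are chosen only for illustration.

```python
value = 4 * 20

# %-formatting: the value sits after the string.
percent_style = 'The value is %d.' % value

# str.format(): the name "value" appears three times.
format_style = 'The value is {value}.'.format(value=value)

# f-string: the name appears once, right where it is used.
fstring_style = f'The value is {value}.'

# An f-string can also hold the expression itself.
inline_expr = f'The value is {4 * 20}.'

print(percent_style)
print(format_style)
print(fstring_style)
print(inline_expr)
```

All four variables hold the same text; the difference is only in how far the value is from its placeholder.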
F-strings provide a concise, readable way to include the value of Python expressions inside strings. Before Python 3.12, backslashes may not appear inside the expression portion of an f-string (the part within curly braces); this restriction was lifted in Python 3.12. Backslash escapes may appear in the string portions of an f-string in every version. For example, trying to escape quotes inside the expression fails on versions before 3.12:
>>> f'{\'quoted string\'}'
  File "<stdin>", line 1
SyntaxError: f-string expression part cannot include a backslash
Where backslashes are not supported within the curly braces of an f-string, you can use a different type of quote inside the expression:
>>> f'{"quoted string"}'
'quoted string'
Use different types of quotes within and outside the curly braces.
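When an expression genuinely needs a character such as a newline, one portable option is to keep the backslash outside the braces by storing it in a variable first. The sketch below assumes nothing beyond the standard language; the variable names are illustrative only.

```python
lines = ['first', 'second', 'third']

# The backslash lives in an ordinary string, not inside the braces,
# so this works on every Python version that supports f-strings.
newline = '\n'
joined = f'{newline.join(lines)}'

# Backslash escapes are always allowed in the literal part of an f-string.
name = 'nostradamus'
labelled = f'seer:\t{name}'

print(joined)
print(labelled)
```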
Python String Format Specifiers:
Format specifiers may also contain evaluated expressions. The syntax for an f-string formatting operation is:
f'string_statements {variable_name [: {width}.{precision}]}'
The f character must prefix the string in f-string formatting. The string_statements part is a string consisting of a sequence of characters. Within curly braces, you specify the variable_name whose value will be displayed. Specifying width and precision values is optional. If they are given as variables, each must be enclosed in its own curly braces. The variable_name must be separated from the width or precision values by a colon. The width value pads the displayed value with spaces. By default, strings are left-justified and numbers are right-justified. For a floating-point number formatted without a presentation type, precision is the number of significant digits displayed, counting digits both before and after the decimal point; the decimal point itself is not counted. For example,
>>> width = 10
>>> precision = 5
>>> value = 12.34567
>>> f'result: {value:{width}.{precision}}'
'result:     12.346'
>>> f'result: {value:{width}}'
'result:   12.34567'
>>> f'result: {value:.{precision}}'
'result: 12.346'
Different ways of string formatting in f-strings.
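The width, precision and default justification described above can be explored a little further with explicit alignment characters and the f presentation type. The following is a small sketch; the sample values and the trailing | marker (added only to make the padding visible) are illustrative.

```python
value = 12.34567
name = 'seer'

# No presentation type: precision counts significant digits.
significant = f'{value:.5}'

# The "f" presentation type: precision counts digits after the point.
fixed = f'{value:.2f}'

# Width with default justification: numbers right, strings left.
number_default = f'{value:10.5}|'
string_default = f'{name:10}|'

# Explicit alignment: < left, > right, ^ centered.
right_aligned = f'{name:>10}|'
centered = f'{name:^10}|'

# Zero padding for integers.
zero_padded = f'{42:05d}'

for sample in (significant, fixed, number_default, string_default,
               right_aligned, centered, zero_padded):
    print(sample)
```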
Python String Format Escape Sequences
Escape sequences are a combination of a backslash (\) followed by either a letter or a combination of letters and digits. Escape sequences are also called control sequences. The backslash (\) character is used to escape the meaning of the characters that follow it, replacing their usual meaning with an alternate interpretation. So, all escape sequences consist of two or more characters. Several common escape sequences used in Python strings are listed below:
Escape sequence | Meaning |
\ (at end of line) | Break a line into multiple lines while ensuring the continuation of the line |
\\ | Insert a backslash character in the string |
\' | Insert a single quote character in the string |
\" | Insert a double quote character in the string |
\n | Insert a new line in the string |
\t | Insert a tab in the string |
\r | Insert a carriage return in the string |
\b | Insert a backspace in the string |
\uxxxx | Insert a Unicode character in the string based on its 16-bit hex value |
\ooo | Insert a character in the string based on its octal value |
\xhh | Insert a character in the string based on its hex value |
>>> print("You can break \
... single line to \
... multiple lines")
You can break single line to multiple lines
>>> print('print backslash \\ inside a string ')
print backslash \ inside a string
>>> print('print single quote \' within a string')
print single quote ' within a string
>>> print("print double quote \" within a string")
print double quote " within a string
>>> print("First line \nSecond line")
First line
Second line
>>> print("tab\tspacing")
tab	spacing
>>> print("same\rlike")
like
>>> print("He\bi")
Hi
>>> print("\u20B9")
₹
>>> print("\046")
&
>>> print("\x24")
$
By placing a backslash (\) character at the end of a line, you can break a single line into multiple lines while ensuring continuation; it indicates that the next line is part of the same statement. To print a backslash, escape it with another backslash. You can use the backslash escape character to add single or double quotation marks to a string. The \n escape sequence inserts a new line without pressing the Enter or Return key; the part of the string after \n appears on the next line. The \t escape sequence provides horizontal indentation. The \r escape sequence inserts a carriage return: when printed to a terminal, the cursor returns to the beginning of the line, so the characters after \r overwrite the same number of characters before it. The \b escape sequence inserts a backspace: on a terminal the cursor moves back one position, so the next character overwrites the previous one. The \u escape sequence inserts a Unicode character given as a 16-bit hex value, such as \u20B9 for the Indian rupee sign. A character can also be inserted based on its octal value (\046 is &) or its hex value (\x24 is $).
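Because terminals interpret characters such as \r and \b when printing, it can help to inspect the string itself rather than its printed form. The sketch below uses only the built-in len(), ord() and repr() functions to show that each escape sequence is stored as a single character; the variable names are illustrative.

```python
# Each escape sequence becomes exactly one character in the string.
newline_length = len("\n")
backspace_word = "He\bi"          # the backspace is kept, not applied
backspace_length = len(backspace_word)

# Octal and hex escapes are just other spellings of ordinary characters.
octal_char = "\046"
hex_char = "\x24"

# A \u escape names a code point by its 16-bit hex value.
rupee = "\u20B9"
rupee_code_point = ord(rupee)

# A backslash at the end of a line continues the string literal.
continued = "You can break \
a single line"

print(newline_length, backspace_length)
print(repr(backspace_word))       # repr() shows the escape instead of acting on it
print(octal_char, hex_char, rupee, hex(rupee_code_point))
print(continued)
```

Using repr() is a handy way to check what a string really contains when the printed output looks surprising.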
Raw Python String Format:
A raw string is created by prefixing the string literal with the character r. In a raw string, backslashes are treated as literal characters, so escape sequences are not interpreted.
>>> print(r"I Says, \" that you are not a good boy.\"")
I Says, \" that you are not a good boy.\"
As you can see in the output, a raw string retains the quotes, the backslashes used to escape them, and all other characters exactly as written.
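Raw strings are often used where backslashes are meant literally, such as Windows-style paths. The following sketch compares a regular and a raw literal with the same source text; the path shown is a made-up example, not a real location.

```python
# In a regular string, \n and \t are interpreted as newline and tab.
regular = "C:\new\table"

# In a raw string, every backslash is kept as a literal character.
raw = r"C:\new\table"

# A raw string is equivalent to doubling each backslash by hand.
doubled = "C:\\new\\table"

print(len(regular), len(raw))
print(raw == doubled)
print(raw)
```

The regular literal is two characters shorter because each of the two escape sequences collapses into a single character.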
Unicode Python String Format:
Fundamentally, computers just deal with numbers. They store letters and other characters by assigning a number to each one. Before Unicode was created, there were hundreds of different systems, called character encodings, for assigning these numbers. These early character encodings were limited and could not accommodate enough characters to cover all the world's languages. Even for a single language like English, no single encoding was adequate for all the letters, punctuation, and technical symbols in common use. The Unicode Standard provides a unique number for each character, no matter what platform, device, application or language. It has been adopted by all modern software providers and now allows data to be transported through many different platforms, devices and applications without corruption.
Unicode can be implemented by different character encodings. The Unicode Standard defines Unicode Transformation Formats such as UTF-8, UTF-16, and UTF-32, and several other encodings are in use. The most commonly used encodings are UTF-8, UTF-16 and UCS-2 (Universal Coded Character Set), a precursor of UTF-16. UTF-8 is the dominant encoding on websites (over 90%); it uses one byte for the first 128 code points and up to 4 bytes for other characters.
In Python 2, regular strings are not Unicode: they are just plain bytes, and a Unicode string is created by putting the 'u' prefix on the string literal. In Python 3, every str is already a Unicode string; the 'u' prefix is still accepted but has no effect. For example,
>>> unicode_string = u'A unicode \u018e string \xf1'
>>> unicode_string
'A unicode Ǝ string ñ'
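In Python 2, a Unicode string is a different type of object (unicode) from the regular "str" string type. In Python 3, the 'u' prefix produces an ordinary str, because every str is already Unicode.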
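The difference between Unicode characters and their encoded bytes can be seen by encoding a string to UTF-8. The sketch below assumes Python 3 and uses only the built-in str.encode() and bytes.decode() methods; the variable names are illustrative.

```python
# In Python 3 the u prefix is optional: both literals are plain str objects.
prefixed = u'A unicode \u018e string \xf1'
plain = 'A unicode \u018e string \xf1'

# Encoding turns the characters into bytes; non-ASCII characters
# take more than one byte each in UTF-8.
utf8_bytes = plain.encode('utf-8')

# Decoding with the same encoding recovers the original text.
round_trip = utf8_bytes.decode('utf-8')

print(type(prefixed).__name__, prefixed == plain)
print(len(plain), len(utf8_bytes))
print(round_trip)
```

The character count and the byte count differ because the two non-ASCII characters each occupy two bytes in UTF-8.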