Why should we use Python?
Python is a versatile and powerful programming language that has become increasingly popular in the field of data science. One of the main advantages of using Python for data science is its ease of use and readability, making it accessible to both novice and experienced programmers. Additionally, Python has a vast ecosystem of libraries and tools specifically designed for data science, including NumPy, Pandas, Matplotlib, and Scikit-learn, which greatly simplify data manipulation, analysis, and visualization.
Python Knowledge Base: Make coding great again.
- Updated: 2025-01-21 by Andrey BRATUS, Senior Data Analyst.
Numbers manipulations.
Variable Assignment.
Data types - Strings.
Printing.
Data types - Lists.
Data types - Dictionaries.
Data types - Booleans.
Data types - Tuples.
Data types - Sets.
Comparison Operators.
Logic Operators.
if, elif, else Statements.
for Loops.
while Loops.
range().
List comprehension.
Functions.
Lambda expressions - a small anonymous function.
Map and filter.
Useful methods.
Python's flexibility also allows for seamless integration with other technologies, such as SQL databases and Hadoop clusters. Moreover, Python is an open-source language, meaning it's free to use and has a large and supportive community of developers constantly contributing to its growth and improvement. Finally, Python's popularity in the data science community means that there are abundant resources available, including tutorials, forums, and online courses, making it easy to learn and master.
In the last several years, Python has become a first-class tool for scientific and analytical tasks, including the analysis and visualization of large data volumes.
Python's effectiveness for data science comes partly from the language itself, but also from its large and active ecosystem of third-party packages, which support data manipulation, common scientific computing tasks, high-quality visualization, interactive execution and sharing of code, machine learning, and many other use cases.
Python's future in the realm of data science looks promising. As companies increasingly rely on data to inform their decision-making processes, the demand for skilled data scientists and analysts is expected to grow. Python's versatility and ease of use make it a top choice for data science projects. Additionally, the development of new libraries and tools specifically designed for data science, such as TensorFlow and PyTorch, will continue to expand Python's capabilities. With the rise of artificial intelligence and machine learning, Python's role in these fields is expected to become even more significant. Finally, Python's open-source nature and active community of developers ensure that it will continue to evolve and improve, making it a reliable choice for data science projects both now and in the future.
This cheat sheet provides a quick tour of the essential features of the Python language for data scientists who want to use basic to intermediate Python techniques.
1 + 1
OUT: 2
1 * 3
OUT: 3
1 / 2
OUT: 0.5
2 ** 4
OUT: 16
6 % 2
OUT: 0
5 % 2
OUT: 1
(1 + 3) * (5 + 5)
OUT: 40
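The examples above cover true division (/) and the remainder (%). As a small complementary sketch, floor division (//) and the built-in divmod() return the whole-number part of a quotient; the variable names below are only illustrative.

```python
# Floor division drops the fractional part of the quotient
quotient = 7 // 2          # 3
# Modulo gives what is left over after floor division
remainder = 7 % 2          # 1
# divmod() returns both values at once as a tuple
both = divmod(7, 2)        # (3, 1)
print(quotient, remainder, both)
```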
# Variable names cannot start with a digit or contain special characters (underscores are allowed)
name_of_var = 2
x = 2
y = 3
z = x + y
z
OUT: 5
'single quotes'
"double quotes"
" wrap lot's of other quotes"
x = 'hello'
print(x)
OUT: hello
num = 12
name = 'Sam'
print('My number is: {one}, and my name is: {two}'.format(one=num,two=name))
OUT: My number is: 12, and my name is: Sam
print('My number is: {}, and my name is: {}'.format(num,name))
OUT: My number is: 12, and my name is: Sam
print(f'My number is: {num}, and my name is: {name}')
OUT: My number is: 12, and my name is: Sam
Lists are used to store multiple items in a single variable.
[1,2,3]
['hi',1,[1,2]]
my_list = ['a','b','c']
my_list.append('d')
my_list
OUT: ['a', 'b', 'c', 'd']
#indexing
my_list[0]
OUT: 'a'
#slicing
my_list[1:]
OUT: ['b', 'c', 'd']
#changing
my_list[0] = 'NEW'
my_list
OUT: ['NEW', 'b', 'c', 'd']
#nesting
nest = [1,2,3,[4,5,['target']]]
nest[3][2][0]
OUT: 'target'
Dictionaries are used to store data values in key:value pairs.
d = {'key1':'item1','key2':'item2'}
d['key1']
OUT: 'item1'
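Beyond looking up a value by key, a dictionary can be extended, updated, and read safely. The sketch below uses a hypothetical dictionary named settings (not part of the original examples) so that d stays unchanged.

```python
# 'settings' and its keys are illustrative names only
settings = {'key1': 'item1', 'key2': 'item2'}

settings['key3'] = 'item3'      # assigning to a new key adds a pair
settings['key1'] = 'changed'    # assigning to an existing key overwrites it

# get() returns a fallback instead of raising KeyError for a missing key
missing = settings.get('no_such_key', 'fallback')

# items() yields (key, value) pairs, handy in a for loop
pairs = [f'{key}={value}' for key, value in settings.items()]
print(missing, pairs)
```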
Booleans represent one of two values: True or False.
True
OUT: True
False
OUT: False
Tuples - collections which are ordered and unchangeable.
t = (1,2,3)
t[0]
OUT: 1
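"Unchangeable" means that item assignment, which works on a list, fails on a tuple. A minimal sketch, using an illustrative tuple named point:

```python
# 'point' is an illustrative name
point = (1, 2, 3)

try:
    point[0] = 'NEW'        # item assignment is not supported by tuples
    changed = True
except TypeError:
    changed = False

# Tuples can still be unpacked into separate variables
a, b, c = point
print(changed, a, b, c)
```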
Sets - collections which are unordered, unindexed, and contain no duplicates. Items cannot be changed in place, but they can be added or removed.
{1,2,3}
OUT: {1, 2, 3}
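A set keeps only one copy of each item, and items can be added after creation. The sketch below also shows intersection and union; the names unique and evens are illustrative.

```python
# Duplicates are dropped when the set is built
unique = {1, 2, 2, 3, 3, 3}

unique.add(4)   # adds a new item
unique.add(1)   # no effect, 1 is already present

evens = {2, 4, 6}
common = unique & evens     # intersection: items in both sets
combined = unique | evens   # union: items in either set
print(unique, common, combined)
```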
1 > 2
OUT: False
1 >= 1
OUT: True
1 == 1
OUT: True
'hi' == 'bye'
OUT: False
(1 == 2) and (2 == 3)
OUT: False
(1 == 2) or (2 == 3) or (4 == 4)
OUT: True
if 1 == 1:
    print('Yep!')
OUT: Yep!
if 1 == 2:
    print('first')
else:
    print('last')
OUT: last
if 1 == 2:
    print('first')
elif 3 == 3:
    print('middle')
else:
    print('Last')
OUT: middle
seq = [1,2,3,4,5]
for item in seq:
    print(item)
OUT: 1
2
3
4
5
i = 1
while i < 5:
    print('i is: {}'.format(i))
    i = i+1
OUT: i is: 1
i is: 2
i is: 3
i is: 4
range(5)
OUT: range(0, 5)
list(range(5))
OUT: [0, 1, 2, 3, 4]
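range() also accepts a start, a stop, and a step. The stop value is never included. A short sketch with illustrative variable names:

```python
# range(start, stop): counts from start up to, but not including, stop
from_two = list(range(2, 6))           # [2, 3, 4, 5]

# range(start, stop, step): step sets the distance between values
every_third = list(range(2, 10, 3))    # [2, 5, 8]

# A negative step counts down
countdown = list(range(5, 0, -1))      # [5, 4, 3, 2, 1]
print(from_two, every_third, countdown)
```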
x = [1,2,3,4]
[item**2 for item in x]
OUT: [1, 4, 9, 16]
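A comprehension can also filter items with a trailing if, and the same syntax with curly braces builds a dictionary. A minimal sketch, with nums as an illustrative name:

```python
nums = [1, 2, 3, 4]

# Keep only even items, then square them
even_squares = [item**2 for item in nums if item % 2 == 0]   # [4, 16]

# Dictionary comprehension: map each item to its square
square_by_item = {item: item**2 for item in nums}
print(even_squares, square_by_item)
```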
def my_func(param1='default'):
    """
    Docstring goes here.
    """
    print(param1)
my_func('new param')
OUT: new param
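Functions usually hand a result back with return rather than printing it, and a default value is used when an argument is omitted. The function add_numbers below is a hypothetical example, not part of the original sheet.

```python
def add_numbers(a, b=10):
    """Return the sum of a and b; b falls back to 10 when omitted."""
    return a + b

with_default = add_numbers(5)          # b takes its default value
with_both = add_numbers(5, 1)          # positional arguments
with_keyword = add_numbers(a=2, b=3)   # keyword arguments
print(with_default, with_both, with_keyword)
```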
x = lambda a, b : a * b
print(x(5, 6))
OUT: 30
seq = [1,2,3,4,5]
list(map(lambda var: var*2,seq))
OUT: [2, 4, 6, 8, 10]
list(filter(lambda item: item%2 == 0,seq))
OUT: [2, 4]
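The list() call in the examples above is needed because, in Python 3, map() and filter() return lazy iterators rather than lists. A short sketch of that behavior, using illustrative names:

```python
values = [1, 2, 3, 4, 5]

# map() returns an iterator; nothing is computed until it is consumed
doubled = map(lambda var: var * 2, values)

first_pass = list(doubled)    # consumes the iterator: [2, 4, 6, 8, 10]
second_pass = list(doubled)   # the iterator is now exhausted: []
print(first_pass, second_pass)
```

Converting to a list once and reusing that list avoids the empty second pass.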
st = 'Hello my name is Sam'
st.lower()
OUT: 'hello my name is sam'
st.upper()
OUT: 'HELLO MY NAME IS SAM'
st.split()
OUT: ['Hello', 'my', 'name', 'is', 'Sam']
d.keys()
OUT: dict_keys(['key1', 'key2'])
d.items()
OUT: dict_items([('key1', 'item1'), ('key2', 'item2')])
lst = [1,2,3]
lst.pop()
OUT: 3
lst
OUT: [1, 2]
'x' in ['x','y','z']
OUT: True