
Coding style

Generally, we follow the PEP-8 standard. We also recommend following the Google Python Style Guide, as well as our own rules listed below. If PEP-8 contradicts our recommendations, use ours. If the Google Style Guide contradicts PEP-8, use PEP-8.

Here's a short introduction to PEP-8 that gives a basic understanding of the standard without reading the whole thing.

Our Recommendations

Line length

  • Limit lines to a maximum of 100-120 characters to improve code readability. The limit can be enforced with the PyCharm and Pylint configs discussed in the next sections.

Naming

  • Use snake_case for variable and function names.
  • Use PascalCase for class names.
  • Use UPPERCASE_WITH_UNDERSCORE for constants and enum members (e.g., LEARNING_RATE = 0.01).
  • Use descriptive names for variables, functions, classes, and modules.
  • Avoid abbreviations, e.g. write features, not feats. Widely recognized ones, like lat for latitude, are okay.
  • Avoid single-character names. Notable exceptions are loop variables, exception aliases in try/except blocks, and variables in recognizable math formulas (including variable names taken from a specific paper you are implementing):
# Bad

def score(x, y, w):
  return w * sum(x ** 2 - y ** 2)


# Okay
import math


def distance(x, y):
  # x and y are assumed to be NumPy arrays of equal length
  return math.sqrt(sum((x - y) ** 2))


# Bad
houses = list()
for i, j in enumerate(houses):
  pass

# Good
for i, house in enumerate(houses):
  pass

# Good
try:
  do_something()
except ValueError as e:
  raise ValueError('some_message') from e

# Good
e = m * c ** 2
  • Avoid using names of builtins as variable/function names. If you really need to, append a trailing underscore: id -> id_, format -> format_. It's more readable than using shortened or mangled versions, like frmt for format.
  • Don't repeat what the context already says: if the function is named process_home_data, there is no need to add a home_ prefix to each argument:
# Bad

def process_home_data(home_price, home_area, home_bedroom_num):
  ...


# Good

def process_home_data(price, area, bedroom_num):
  ...


# Bad

class Network:
  network_base = ResNet50
  network_classification_head = Dense
  network_augmentation_layers = [Rescaling, Normalization]


# Good

class Network:
  base = ResNet50
  classification_head = Dense
  augmentation_layers = [Rescaling, Normalization]
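
The naming rules above can be combined in one short sketch. All names in it (MAX_RESULTS, PropertyType, filter_by_type) are hypothetical and only illustrate the conventions:

```python
import enum

# Constant: UPPERCASE_WITH_UNDERSCORE
MAX_RESULTS = 10


class PropertyType(enum.Enum):
  # Enum members are named like constants
  APARTMENT = 'apartment'
  HOUSE = 'house'


def filter_by_type(listings, type_):
  """Return at most MAX_RESULTS listings of the given property type."""
  # `type_` avoids shadowing the builtin `type`
  matching = [listing for listing in listings if listing['type'] is type_]
  return matching[:MAX_RESULTS]
```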

Imports

  • Import modules and packages on separate lines.
  • Use absolute imports rather than relative ones, i.e. from package import mymodule, not from . import mymodule
  • Avoid wildcard imports: from module import *
  • Use aliases during import (import x as y or from z import x as y) only in the following cases:
      • there are several modules with the same name and you need to distinguish them
      • the alias y is widely recognized, e.g. import tensorflow as tf, import numpy as np
      • x is very long and inconvenient to use
      • x is too generic, e.g. from package import settings can be replaced by from package import settings as package_settings
  • Group imports in the following order (handled by PyCharm's Optimize Imports):
      • Standard library imports
      • Third-party library imports
      • Local module imports
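
A minimal sketch of a module header that follows this ordering. Only the standard library group is live code here. The third-party and local lines are left as comments, and `package` is a hypothetical local package:

```python
# Standard library imports, one module per line
import json
import pathlib

# Third-party library imports would follow, using only widely
# recognized aliases, e.g.:
# import numpy as np

# Local module imports come last; an alias is fine when the name is
# too generic (`package` is a hypothetical local package):
# from package import settings as package_settings


def load_json(path):
  """Read a JSON file and return its parsed content."""
  return json.loads(pathlib.Path(path).read_text())
```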

Docstrings and Comments

  • Use docstrings to provide clear explanations for classes, functions, and modules. Ideally, a new developer should be able to understand what a function does from its docstring alone, without reading the code.
  • Use inline comments to clarify complex code sections or provide context. Don't overuse them - if you feel the need to write a comment, maybe there's a better way to refactor the code instead.
  • Use the NumPy documentation style so we can use mkdocs to autogenerate the API documentation. In PyCharm, it can be configured in the Settings -> Tools -> Python Integrated Tools tab.

Figure: Configuring NumPy docstrings in the PyCharm settings tab.
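
For reference, a NumPy-style docstring could look roughly like this. The function price_per_square_meter is a hypothetical example, not part of any existing codebase:

```python
def price_per_square_meter(price, area):
  """Compute the price of a property per square meter.

  Parameters
  ----------
  price : float
      Total price of the property.
  area : float
      Living area in square meters. Must be positive.

  Returns
  -------
  float
      The price divided by the area.

  Raises
  ------
  ValueError
      If `area` is not positive.
  """
  if area <= 0:
    raise ValueError('area must be positive')
  return price / area
```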

Function and Method Definitions

  • Keep functions and methods small and focused on a single task (following the Single Responsibility Principle).
  • Limit the number of function arguments, ideally to fewer than 5. One way to reduce the number of arguments is to group them into objects using pydantic models or dataclasses. This improves readability and makes future modifications easier:
from pydantic import BaseModel


# Bad

def generate_ad_copy(price, area, property_type, subtype, features, lat, lng, num_sentences):
  ...


# Good

class House(BaseModel):
  price: float
  area: float
  property_type: str
  subtype: str
  features: list
  lat: float
  lng: float


def generate_ad_copy(house, num_sentences):
  ...


# Best

class Location(BaseModel):
  lat: float
  lng: float


class House(BaseModel):
  location: Location
  ...


def generate_ad_copy(house, num_sentences):
  ...

  • Avoid side effects, like mutating the inputs or modifying global variables. If you need to make some changes to the input, copy it and return the value instead.
  • Avoid multiple nested blocks of code. They are hard to read and maintain. One way to reduce the number of nested blocks is to handle edge cases at the start of the method/function definition:
# Bad
def calculate_property_tax(value):
  if value <= 100000:
    return value * 0.01
  else:
    if value <= 500000:
      return value * 0.02
    else:
      if value <= 1000000:
        return value * 0.03
      else:
        return value * 0.05


# Okay

def calculate_property_tax(value):
  if value <= 100000:
    return value * 0.01
  elif value <= 500000:
    return value * 0.02
  elif value <= 1000000:
    return value * 0.03
  else:
    return value * 0.05


# Good

def calculate_property_tax(value):
  if value <= 100000:
    return value * 0.01
  if value <= 500000:
    return value * 0.02
  if value <= 1000000:
    return value * 0.03
  return value * 0.05
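
The argument grouping and "no side effects" rules from this section can also be sketched with dataclasses instead of pydantic. The field values and the with_feature helper are hypothetical. A frozen dataclass is one possible way to make accidental mutation of inputs fail loudly:

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Location:
  lat: float
  lng: float


@dataclass(frozen=True)
class House:
  price: float
  area: float
  location: Location
  features: Tuple[str, ...] = ()


def with_feature(house: House, feature: str) -> House:
  """Return a copy of `house` with `feature` added; the input is untouched."""
  return replace(house, features=house.features + (feature,))


house = House(price=250000.0, area=82.5, location=Location(lat=40.18, lng=44.51))
renovated = with_feature(house, 'renovated')
print(house.features)      # ()
print(renovated.features)  # ('renovated',)
```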

Type Annotations

  • Use type annotations to indicate function parameter types and return types.
  • Use Python's typing module for more complex type hinting.
  • Use the mypy type checker to verify type hints.
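
A small sketch of annotated code using the typing module. The function find_cheapest and its parameters are hypothetical:

```python
from typing import Dict, Optional


def find_cheapest(prices: Dict[str, float], max_price: Optional[float] = None) -> Optional[str]:
  """Return the ID of the cheapest listing, or None if nothing qualifies."""
  candidates = {
    listing_id: price
    for listing_id, price in prices.items()
    if max_price is None or price <= max_price
  }
  if not candidates:
    return None
  return min(candidates, key=lambda listing_id: candidates[listing_id])
```

With these annotations in place, running mypy on the module should report a call such as find_cheapest(['a', 'b']), since a list is not a Dict[str, float].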

Exception Handling

  • Handle specific exception types instead of catching general exceptions, e.g., except ValueError instead of except Exception.
  • You can catch the generic Exception if you intend to re-raise it or convert it into an error response. For example, in API endpoints you can catch Exception and return a 500 Internal Server Error instead:
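# FastAPI example
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse


class ExceptionHandlerMiddleware(BaseHTTPMiddleware):
  async def dispatch(self, request, call_next):
    try:
      return await call_next(request)
    except Exception as e:
      # Convert any unhandled error into a 500 response
      return JSONResponse({'detail': str(e)}, status_code=500)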
  • You can also catch the generic Exception if you intend to suppress it and keep the program running. For example, you can write the exception details to a log file without stopping the application.
  • Handle exceptions gracefully, providing meaningful error messages.
  • For FastAPI applications, consider using middleware for exception handling.
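
The rules about specific exception types, meaningful messages, and logged suppression can be sketched together as follows. The functions parse_price and parse_prices are hypothetical:

```python
import logging

logger = logging.getLogger(__name__)


def parse_price(raw):
  """Convert a raw string to a float price."""
  try:
    return float(raw)
  except ValueError as e:
    # Catch the specific exception and re-raise with a meaningful message
    raise ValueError(f'Invalid price: {raw!r}') from e


def parse_prices(raw_values):
  """Parse all valid prices, logging and skipping the invalid ones."""
  prices = []
  for raw in raw_values:
    try:
      prices.append(parse_price(raw))
    except ValueError:
      # Suppress deliberately: log the details and keep going
      logger.exception('Skipping unparsable price %r', raw)
  return prices
```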

String Formatting

  • Use f-strings (formatted string literals) for string interpolation when possible.
  • For complex formatting, use the str.format() method.
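
A short illustration of both options. The values and the template are made up, and keeping the template separate from the values is only one case where str.format() is convenient:

```python
area = 82.5
price = 250000

# f-string for direct interpolation with format specifiers
summary = f'{area:.1f} sq. m for {price:,} USD'
print(summary)  # 82.5 sq. m for 250,000 USD

# str.format() when the template is defined separately from the values
REPORT_TEMPLATE = '{city}: {count} listings, average price {average:,.0f} USD'
report = REPORT_TEMPLATE.format(city='Springfield', count=3, average=198333.33)
print(report)  # Springfield: 3 listings, average price 198,333 USD
```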

Service configuration

  • Do not use hard-coded values or magic numbers:
# Bad
def calculate_discount(price):
  if price > 100:
    return price * 0.9  # 0.9 is a magic number representing a 10% discount
  else:
    return price


# Good
DISCOUNT_THRESHOLD = 100
DISCOUNT_MULTIPLIER = 0.9  # 10% discount


def calculate_discount(price):
  if price > DISCOUNT_THRESHOLD:
    return price * DISCOUNT_MULTIPLIER
  else:
    return price

  • Keep constants and configuration in separate files
  • Consider using the .yaml extension for config files
  • Use pydantic to parse config files
  • Do not store secrets in the code or anywhere else in the repo
  • Keep secrets in the .env file. Do not push .env files to Git. To share .env files, send them directly to other code contributors. (This requirement will change after the integration of secrets management tools like Doppler.)
  • Use pydantic to parse .env files
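
A minimal sketch of validating a config with pydantic, assuming pydantic is installed. The TrainingConfig model and its fields are hypothetical, and the dict stands in for whatever a YAML loader would return for the config file:

```python
from pydantic import BaseModel, ValidationError


class TrainingConfig(BaseModel):
  learning_rate: float
  batch_size: int
  optimizer: str = 'adam'


# Stand-in for the dict a YAML loader would return
raw_config = {'learning_rate': 0.01, 'batch_size': 32}
config = TrainingConfig(**raw_config)
print(config.learning_rate, config.batch_size, config.optimizer)  # 0.01 32 adam

# Invalid values are rejected when the config is loaded, not later at use
try:
  TrainingConfig(learning_rate=0.01, batch_size='many')
except ValidationError as e:
  print('Invalid config:', e)
```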

Other

  • Consider using pydantic or dataclasses instead of dictionaries. This improves code readability, enables type hints, and serves as an additional validation layer.
from pydantic import BaseModel


# Bad
def have_same_name(user1: dict, user2: dict) -> bool:
  return user1['name'] == user2['name']


# Good
class User(BaseModel):
  name: str
  ...


def have_same_name(user1: User, user2: User) -> bool:
  return user1.name == user2.name
  • Use aliases in pydantic.Field for API request/response models so that snake_case attributes can be exposed as camelCase keys.
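
A minimal sketch of the alias approach, assuming pydantic is installed. The HouseRequest model and the payload are hypothetical:

```python
from pydantic import BaseModel, Field


class HouseRequest(BaseModel):
  # Python code uses snake_case; the API payload uses camelCase
  bedroom_num: int = Field(alias='bedroomNum')
  property_type: str = Field(alias='propertyType')


payload = {'bedroomNum': 3, 'propertyType': 'apartment'}
house_request = HouseRequest(**payload)
print(house_request.bedroom_num, house_request.property_type)  # 3 apartment
```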