Scientific Computing with Python
Python has become the de facto language for scientific computing, largely thanks to NumPy and SciPy. Let’s explore these powerful libraries.
Why NumPy?
NumPy provides:
- Fast array operations
- Memory-efficient data structures
- Mathematical functions
- Linear algebra operations
Installation
pip install numpy scipy matplotlib
NumPy Basics
Creating Arrays
import numpy as np
# Create arrays
a = np.array([1, 2, 3, 4, 5])
b = np.zeros((3, 3))
c = np.linspace(0, 10, 100)
d = np.random.randn(5, 5)
print(f"Array shape: {a.shape}")
print(f"Data type: {a.dtype}")
Array Operations
# Element-wise operations
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
z = x + y # [5, 7, 9]
w = x * y # [4, 10, 18]
# Matrix operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.dot(A, B) # Matrix multiplication
D = A @ B # Alternative syntax
Indexing and Slicing
arr = np.arange(0, 100, 5)
# Slicing
print(arr[0:5]) # First 5 elements
print(arr[-3:]) # Last 3 elements
print(arr[::2]) # Every other element
# Boolean indexing
mask = arr > 50
print(arr[mask]) # Elements greater than 50
SciPy for Advanced Operations
Optimization
from scipy.optimize import minimize
def objective(x):
return x[0]**2 + x[1]**2 + 10
result = minimize(objective, x0=[1, 1])
print(f"Minimum at: {result.x}")
Integration
from scipy.integrate import quad
def integrand(x):
return np.exp(-x**2)
result, error = quad(integrand, 0, np.inf)
print(f"Integral: {result:.6f}")
Signal Processing
from scipy import signal
# Create a signal
t = np.linspace(0, 1, 1000)
sig = np.sin(2 * np.pi * 5 * t) + np.sin(2 * np.pi * 10 * t)
# Apply filter
b, a = signal.butter(4, 0.1)
filtered = signal.filtfilt(b, a, sig)
Interpolation
from scipy.interpolate import interp1d
x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 1, 4, 9, 16])
f = interp1d(x, y, kind='cubic')
x_new = np.linspace(0, 4, 100)
y_new = f(x_new)
Practical Example: Analyzing Simulation Data
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
# Load trajectory data
data = np.loadtxt('trajectory.dat')
# Calculate statistics
mean = np.mean(data, axis=0)
std = np.std(data, axis=0)
# Perform statistical test
statistic, pvalue = stats.ttest_1samp(data[:, 0], 0)
# Correlation analysis
correlation = np.corrcoef(data.T)
print(f"Mean: {mean}")
print(f"Std Dev: {std}")
print(f"P-value: {pvalue}")
Linear Algebra
# Solve linear system Ax = b
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
x = np.linalg.solve(A, b)
print(f"Solution: {x}")
# Eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print(f"Eigenvalues: {eigenvalues}")
# Matrix decomposition
U, s, Vt = np.linalg.svd(A)
Performance Tips
- Vectorize operations instead of loops
- Pre-allocate arrays when possible
- Use in-place operations to save memory
- Choose appropriate data types
- Use NumPy functions instead of Python built-ins
Example: Vectorization
# Slow (Python loop)
result = []
for i in range(1000000):
result.append(i ** 2)
# Fast (NumPy vectorized)
result = np.arange(1000000) ** 2
Common Patterns
Broadcasting
# Add a vector to each row of a matrix
matrix = np.ones((3, 3))
vector = np.array([1, 2, 3])
result = matrix + vector # Broadcasting
Reduction Operations
data = np.random.randn(100, 50)
# Operations along axes
row_means = np.mean(data, axis=1)
col_sums = np.sum(data, axis=0)
total = np.sum(data)
Conclusion
NumPy and SciPy form the foundation of scientific computing in Python. Master these tools to write efficient, readable code for your research.
For more advanced topics, explore:
- pandas for data analysis
- matplotlib for visualization
- scikit-learn for machine learning
Ready to optimize your code? Check out our performance tuning guide!