Preprocessing text for NLP models often requires every token sequence to have the same length. Write a function that pads shorter sequences and truncates longer ones to a fixed max_len.
For a sequence of length n and target length max_len, define k = min(n, max_len) and produce an output of length max_len by copying k tokens (from the start or end) and filling the rest with pad_value.
Implement a function that accepts the arguments below and returns the padded array.
Rules:

- seqs is a standard Python list of lists.
- If a sequence is longer than max_len, truncate it according to truncating ("pre" keeps the last max_len tokens, "post" keeps the first max_len).
- If a sequence is shorter than max_len, pad it with pad_value according to padding ("pre" pads on the left, "post" pads on the right).
- Do not modify seqs in-place.
- The output has shape (batch_size, max_len).

Input:
| Argument | Type |
|---|---|
| seqs | list |
| max_len | int |
| padding | str |
| pad_value | int |
| truncating | str |

Output:
| Return Name | Type |
|---|---|
| value | np.ndarray |
Constraints:

- Use only NumPy and Python built-ins.
- Do not modify the input seqs in-place.
- Each output sequence length equals max_len.
Start by deciding how to handle a single sequence: if len(seq) > max_len, slice it based on truncating (seq[-max_len:] for "pre", seq[:max_len] for "post").
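The truncation step above can be sketched as a small helper (the name `truncate` is my own, not part of the required interface):

```python
def truncate(seq, max_len, truncating="post"):
    """Keep at most max_len tokens from one sequence.

    "pre" keeps the last max_len tokens; "post" keeps the first max_len.
    """
    if len(seq) <= max_len:
        return list(seq)  # copy, so the caller's list is never mutated
    return seq[-max_len:] if truncating == "pre" else seq[:max_len]
```

Note that slicing already returns a new list, which helps satisfy the no-in-place-modification rule.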
Once you have the tokens to keep, compute how many pads you need: pad_count = max_len - len(tokens). Then build the final sequence by adding [pad_value] * pad_count either before or after depending on padding.
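The padding step can be sketched the same way (the helper name `pad` is illustrative):

```python
def pad(tokens, max_len, padding="post", pad_value=0):
    """Extend tokens to exactly max_len by adding pad_value on one side."""
    pad_count = max_len - len(tokens)
    fill = [pad_value] * pad_count
    # "pre" pads on the left, "post" pads on the right
    return fill + list(tokens) if padding == "pre" else list(tokens) + fill
```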
Using NumPy can simplify this: pre-allocate np.full((len(seqs), max_len), pad_value) and copy tokens into either padded[i, :n] (post) or padded[i, -n:] (pre). Return the NumPy array.