ciberlabreport.preprocesing.virustotal module

VirusTotal preprocessing helpers and normalization routines.

This module wraps VirusTotal API access, normalizes raw responses into a compact structure, and derives ransomware indicators from multiple signals. It also defines dataclasses used to represent the normalized data payload.

Environment variables:

MODE (str): When set to “DEV”, raw and normalized outputs are stored on disk.

TMP_PATH (str, optional): Base directory used to store debug JSON files.
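
A minimal sketch of how these two variables could be read. The helper name `read_debug_settings` and the fallback to the current directory when TMP_PATH is unset are assumptions made for illustration, not documented behavior of this module.

```python
import os
from pathlib import Path


def read_debug_settings(env=os.environ):
    """Return (dev_mode, tmp_path) derived from MODE and TMP_PATH."""
    # Debug output is enabled only when MODE is exactly "DEV".
    dev_mode = env.get("MODE", "") == "DEV"
    # TMP_PATH is optional; defaulting to "." is an assumption of this sketch.
    tmp_path = Path(env.get("TMP_PATH", "."))
    return dev_mode, tmp_path
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise without touching the real environment.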

class ciberlabreport.preprocesing.virustotal.VTConfig(search_url: str, high_threshold: int, medium_threshold: int, ransom_pattern: Pattern, evasion_keywords: Set[str])

Bases: object

Modifiable configuration used by the wrapper.

evasion_keywords: Set[str]
high_threshold: int
medium_threshold: int
ransom_pattern: Pattern
search_url: str
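
The following sketch shows what a populated configuration might look like. The dataclass is a stand-in that mirrors the documented signature so the example runs on its own; every value (URL, thresholds, pattern, keywords) is hypothetical and not taken from the project's actual configuration file.

```python
import re
from dataclasses import dataclass
from typing import Pattern, Set


@dataclass
class VTConfig:
    """Stand-in with the same fields as the documented VTConfig."""
    search_url: str
    high_threshold: int
    medium_threshold: int
    ransom_pattern: Pattern
    evasion_keywords: Set[str]


# All values below are illustrative assumptions.
config = VTConfig(
    search_url="https://www.virustotal.com/api/v3/files",
    high_threshold=10,
    medium_threshold=3,
    ransom_pattern=re.compile(r"ransom|locker|crypt", re.IGNORECASE),
    evasion_keywords={"anti-vm", "anti-debug"},
)
```
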
class ciberlabreport.preprocesing.virustotal.VTFinalData(sha256: str, verdict: VTVerdict, threat_profile: VTThreatProfile, sandbox_signals: VTSandboxSignals, yara_summary: VTYaraSummary, is_ransomware: bool, ransomware_evidence: List[str] = <factory>)

Bases: object

Normalized VirusTotal data payload returned by the wrapper.

is_ransomware: bool
ransomware_evidence: List[str]
sandbox_signals: VTSandboxSignals
sha256: str
threat_profile: VTThreatProfile
verdict: VTVerdict
yara_summary: VTYaraSummary
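
As a rough illustration of how the nested payload becomes the plain dict that the wrapper returns, the sketch below re-declares the documented dataclasses as stand-ins and converts an instance with `dataclasses.asdict`. Whether the module itself uses `asdict` is an assumption, and the field values are invented.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class VTVerdict:
    is_malicious: bool
    malicious_engines: int
    suspicious_engines: int
    confidence_level: str


@dataclass
class VTThreatProfile:
    family: Optional[str]
    family_confidence: str
    categories: List[str] = field(default_factory=list)
    suggested_label: Optional[str] = None


@dataclass
class VTSandboxSignals:
    malicious_sandboxes: List[str] = field(default_factory=list)
    malware_names: List[str] = field(default_factory=list)
    evasion_indicators: List[str] = field(default_factory=list)


@dataclass
class VTYaraSummary:
    highlights: List[str] = field(default_factory=list)


@dataclass
class VTFinalData:
    sha256: str
    verdict: VTVerdict
    threat_profile: VTThreatProfile
    sandbox_signals: VTSandboxSignals
    yara_summary: VTYaraSummary
    is_ransomware: bool
    ransomware_evidence: List[str] = field(default_factory=list)


# Hypothetical values; list fields fall back to empty lists via their factories.
final = VTFinalData(
    sha256="a" * 64,
    verdict=VTVerdict(True, 42, 3, "high"),
    threat_profile=VTThreatProfile(family="wannacry", family_confidence="high"),
    sandbox_signals=VTSandboxSignals(),
    yara_summary=VTYaraSummary(highlights=["WannaCry_Ransomware"]),
    is_ransomware=True,
)
payload = asdict(final)  # nested dataclasses are converted recursively
```
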
class ciberlabreport.preprocesing.virustotal.VTSandboxSignals(malicious_sandboxes: List[str] = <factory>, malware_names: List[str] = <factory>, evasion_indicators: List[str] = <factory>)

Bases: object

Sandbox-based indicators extracted from VT verdicts and tags.

evasion_indicators: List[str]
malicious_sandboxes: List[str]
malware_names: List[str]
class ciberlabreport.preprocesing.virustotal.VTThreatProfile(family: str | None, family_confidence: str, categories: List[str] = <factory>, suggested_label: str | None = None)

Bases: object

Threat family and category profile derived from VT metadata.

categories: List[str]
family: str | None
family_confidence: str
suggested_label: str | None = None
class ciberlabreport.preprocesing.virustotal.VTVerdict(is_malicious: bool, malicious_engines: int, suspicious_engines: int, confidence_level: str)

Bases: object

Summary of VirusTotal engine verdicts for a file hash.

confidence_level: str
is_malicious: bool
malicious_engines: int
suspicious_engines: int
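
One plausible way to derive `confidence_level` from the engine count and the two thresholds in VTConfig is sketched below. The comparison rule and the label strings "high", "medium", and "low" are assumptions; the source only states that the thresholds and the field exist.

```python
def confidence_level(malicious_engines: int, high_threshold: int, medium_threshold: int) -> str:
    """Map a malicious-engine count to a coarse confidence label (assumed rule)."""
    if malicious_engines >= high_threshold:
        return "high"
    if malicious_engines >= medium_threshold:
        return "medium"
    return "low"
```
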
class ciberlabreport.preprocesing.virustotal.VTYaraSummary(highlights: List[str] = <factory>)

Bases: object

YARA rule match highlights from VT crowdsourced results.

highlights: List[str]
class ciberlabreport.preprocesing.virustotal.VirusTotalWrapper(config_path: Path, api_keys: list, mode: str, tmp_path: Path)

Bases: object

Client wrapper to query VirusTotal and normalize responses.

call(filehash: str) -> dict

Fetch VirusTotal data for a given file hash.

Parameters:

filehash (str) – SHA256 hash to query in VirusTotal.

Returns:

Raw VirusTotal API response (or empty dict on failure).

Return type:

dict
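
A hedged sketch of the request such a call might build, assuming the wrapper targets the VirusTotal API v3 file endpoint and authenticates with the `x-apikey` header. The helper name `build_vt_request`, the way the hash is appended to `search_url`, and the placeholder key are illustrative; the request is constructed here but not sent.

```python
import urllib.request


def build_vt_request(search_url: str, filehash: str, api_key: str) -> urllib.request.Request:
    """Build (without sending) a GET request for a file hash lookup."""
    # Assumption: the hash is appended to the configured search URL.
    url = search_url.rstrip("/") + "/" + filehash
    headers = {"x-apikey": api_key, "accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


request = build_vt_request("https://www.virustotal.com/api/v3/files", "a" * 64, "MY_KEY")
```

Sending the request (for example with `urllib.request.urlopen`) and returning an empty dict on any error would match the documented failure behavior.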

check_store_result(out: dict, outname: str) -> None

Optionally store debug JSON data when running in DEV mode.

Parameters:
  • out (dict) – Data to serialize to disk.

  • outname (str) – Filename appended to the TMP_PATH directory.
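
The behavior described above could look roughly like this. The free function `store_debug_json` is a hypothetical stand-in that takes `mode` and `tmp_path` explicitly rather than reading them from the wrapper instance; creating the directory on demand is also an assumption.

```python
import json
import tempfile
from pathlib import Path


def store_debug_json(out: dict, outname: str, mode: str, tmp_path: Path) -> None:
    """Write `out` as JSON under tmp_path, but only in DEV mode."""
    if mode != "DEV":
        return
    tmp_path.mkdir(parents=True, exist_ok=True)
    with open(tmp_path / outname, "w", encoding="utf-8") as fh:
        json.dump(out, fh, indent=2)


# Usage: one write in DEV mode, one skipped call in another mode.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "vt_debug"
    store_debug_json({"sha256": "abc"}, "vt_raw.json", mode="DEV", tmp_path=base)
    written = json.loads((base / "vt_raw.json").read_text(encoding="utf-8"))
    store_debug_json({"sha256": "abc"}, "skipped.json", mode="PROD", tmp_path=base)
    skipped_exists = (base / "skipped.json").exists()
```
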

get_data(raw: dict) -> dict

Fetch and normalize VirusTotal data for an input report payload.

Parameters:

raw (dict) – Raw input JSON containing target file metadata.

Returns:

Normalized VirusTotal data, or an empty dict if any step fails.

Return type:

dict
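
The fetch-then-normalize flow can be sketched as a small pipeline. The three steps are passed in as callables so the example stays self-contained; `get_data_sketch` and the catch-all exception handling are assumptions about how "empty dict if something fails" might be implemented.

```python
def get_data_sketch(raw, read_hash, fetch, normalize):
    """Chain hash extraction, API lookup, and normalization; {} on any failure."""
    try:
        filehash = read_hash(raw)
        if not filehash:
            return {}
        vt_raw = fetch(filehash)
        if not vt_raw:
            return {}
        return normalize(filehash, vt_raw)
    except Exception:
        # Assumed behavior: swallow errors and signal failure with an empty dict.
        return {}


# Usage with trivial stand-in steps.
ok = get_data_sketch(
    {"sha256": "abc"},
    read_hash=lambda raw: raw.get("sha256", ""),
    fetch=lambda h: {"data": {"id": h}},
    normalize=lambda h, vt: {"sha256": h, "id": vt["data"]["id"]},
)
missing = get_data_sketch({}, lambda raw: raw.get("sha256", ""), lambda h: {}, lambda h, vt: {})
```
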

normalize(filehash: str, vt_raw: dict) -> dict

Normalize a VirusTotal response into the internal data model.

This method extracts verdict counts, threat profiles, sandbox signals, YARA highlights, and ransomware evidence into a structured dict.

Parameters:
  • filehash (str) – SHA256 hash tied to the VT response.

  • vt_raw (dict) – Raw VirusTotal API response data.

Returns:

Normalized VirusTotal data as a plain dictionary.

Return type:

dict
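
A partial sketch of this kind of normalization, assuming the response follows the VirusTotal API v3 file object layout (`data.attributes.last_analysis_stats`, `popular_threat_classification`, `crowdsourced_yara_results`). The function name, the output keys, and the rule that any pattern match counts as ransomware evidence are assumptions; threat profile and sandbox signals are omitted for brevity.

```python
import re


def normalize_sketch(filehash, vt_raw, ransom_pattern):
    """Extract verdict counts, YARA highlights, and ransomware evidence."""
    attrs = vt_raw.get("data", {}).get("attributes", {})
    stats = attrs.get("last_analysis_stats", {})
    malicious = stats.get("malicious", 0)
    suspicious = stats.get("suspicious", 0)

    classification = attrs.get("popular_threat_classification", {})
    label = classification.get("suggested_threat_label")
    yara = [r.get("rule_name", "") for r in attrs.get("crowdsourced_yara_results", [])]

    # Collect evidence from more than one signal: threat label and YARA rule names.
    evidence = []
    if label and ransom_pattern.search(label):
        evidence.append(f"label:{label}")
    evidence += [f"yara:{name}" for name in yara if ransom_pattern.search(name)]

    return {
        "sha256": filehash,
        "verdict": {"malicious_engines": malicious, "suspicious_engines": suspicious},
        "yara_summary": {"highlights": yara},
        "is_ransomware": bool(evidence),
        "ransomware_evidence": evidence,
    }


# Invented sample response and pattern for demonstration only.
sample = {"data": {"attributes": {
    "last_analysis_stats": {"malicious": 42, "suspicious": 3},
    "popular_threat_classification": {"suggested_threat_label": "trojan.wannacry/ransom"},
    "crowdsourced_yara_results": [{"rule_name": "WannaCry_Ransomware"}, {"rule_name": "PE_Generic"}],
}}}
result = normalize_sketch("abc", sample, re.compile(r"ransom", re.IGNORECASE))
```
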

read_hash_from_raw(raw: dict) -> str

Extract the SHA256 hash from the input report payload.

Parameters:

raw (dict) – Original JSON input with a nested target/file section.

Returns:

SHA256 hash string if present; otherwise an empty string.

Return type:

str
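
A defensive lookup along these lines would satisfy the contract above. The key names `target`, `file`, and `sha256` are inferred from the description of a nested target/file section and may differ from the real input schema; `read_hash_sketch` is a hypothetical name.

```python
def read_hash_sketch(raw: dict) -> str:
    """Return raw["target"]["file"]["sha256"] or "" if any level is missing."""
    target = raw.get("target")
    file_section = target.get("file") if isinstance(target, dict) else None
    value = file_section.get("sha256") if isinstance(file_section, dict) else None
    return value if isinstance(value, str) else ""
```

Returning an empty string instead of raising lets the caller treat a missing hash as an ordinary failure case.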