Data Schemas Reference
This reference documents the data schemas used in Vysion API responses. Understanding them is essential for parsing responses correctly and integrating with the API.
Core Response Structure
All Vysion API responses follow a consistent wrapper format:
{ "data": { "total": 0, "hits": [] }, "error": null }
Field | Type | Description |
---|---|---|
data | object/null | Contains the response data, or null if an error occurred |
data.total | integer | Total number of results available |
data.hits | array | Array of result objects |
error | object/null | Error information or null if successful |
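The wrapper above can be unpacked in one place so the rest of a client only deals with hits. The following is a minimal sketch, not an official client: `VysionAPIError` and `unwrap_response` are hypothetical names, and the sketch assumes `error`, when present, carries the `code` and `message` fields of the Error Object.

```python
from typing import Any, Dict, List, Tuple


class VysionAPIError(Exception):
    """Hypothetical exception type; not part of any official client."""

    def __init__(self, code: Any, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap_response(payload: Dict[str, Any]) -> Tuple[int, List[Any]]:
    """Return (total, hits) from a wrapped response, or raise on error."""
    error = payload.get("error")
    if error is not None:
        raise VysionAPIError(error.get("code"), error.get("message", ""))
    # "data" may be null, so fall back to an empty object
    data = payload.get("data") or {}
    return data.get("total", 0), data.get("hits", [])
```

Raising on a non-null `error` keeps the happy path free of repeated null checks; callers that prefer return values could return the error object instead.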
Document Schemas
DocumentHit Schema
Used in document search and retrieval endpoints.
{
  "page": {
    "id": "string",
    "url": {
      "url": "string",
      "networkProtocol": "string",
      "domainName": "string",
      "port": 0,
      "path": "string",
      "signature": "uuid",
      "network": "tor"
    },
    "foundAt": "string",
    "pageTitle": "string",
    "language": "en",
    "html": "string",
    "text": "string",
    "sha1sum": "string",
    "sha256sum": "string",
    "ssdeep": "string",
    "detectionDate": "2019-08-24T14:15:22Z",
    "screenshot": "string",
    "chunk": false
  },
  "tag": [
    { "namespace": "string", "predicate": "string", "value": "string" }
  ],
  "email": [],
  "paste": [],
  "skype": [],
  "telegram": [],
  "whatsapp": [],
  "bitcoin_address": [
    { "value": "string" }
  ],
  "polkadot_address": [],
  "ethereum_address": [],
  "monero_address": [],
  "ripple_address": [],
  "zcash_address": []
}
Page Object
Field | Type | Required | Description |
---|---|---|---|
id | string | true | Unique document identifier |
url | object | true | URL information object |
foundAt | string | false | Source where document was found |
pageTitle | string | false | Title of the webpage |
language | string | false | Detected language (ISO 639-1) |
html | string | false | Raw HTML content |
text | string | false | Extracted text content |
sha1sum | string | false | SHA1 hash of content |
sha256sum | string | false | SHA256 hash of content |
ssdeep | string | false | Fuzzy hash for similarity detection |
detectionDate | string | true | ISO 8601 timestamp |
screenshot | string | false | Screenshot URL if available |
chunk | boolean | false | Whether this is a partial document |
URL Object
Field | Type | Description |
---|---|---|
url | string | Complete URL |
networkProtocol | string | Protocol (http, https) |
domainName | string | Domain name |
port | integer | Port number |
path | string | URL path |
signature | string | URL signature (UUID) |
network | string | Network type (tor, clearnet) |
Tag Object
Field | Type | Description |
---|---|---|
namespace | string | Tag namespace |
predicate | string | Tag predicate |
value | string | Tag value |
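Because each tag is a namespace/predicate/value triple, it is often convenient to filter tags by namespace and render them as single strings. The sketch below is illustrative only: the `namespace:predicate=value` rendering is an assumed convention, not something the schema prescribes, and `format_tag` and `tags_in_namespace` are hypothetical helpers.

```python
from typing import Any, Dict, List


def format_tag(tag: Dict[str, Any]) -> str:
    """Render a Tag Object as 'namespace:predicate=value' (assumed convention)."""
    return f"{tag.get('namespace', '')}:{tag.get('predicate', '')}={tag.get('value', '')}"


def tags_in_namespace(hit: Dict[str, Any], namespace: str) -> List[str]:
    """Return the formatted tags of a hit that belong to one namespace."""
    return [
        format_tag(tag)
        for tag in hit.get("tag") or []
        if tag.get("namespace") == namespace
    ]
```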
Ransomware Schemas
RansomwareHit Schema
Used for ransomware victim data.
{
  "page": {
    "id": "string",
    "url": {
      "url": "string",
      "networkProtocol": "string",
      "domainName": "string",
      "port": 0,
      "path": "string",
      "signature": "uuid",
      "network": "tor"
    },
    "foundAt": "string",
    "pageTitle": "string",
    "language": "en",
    "detectionDate": "2019-08-24T14:15:22Z"
  },
  "tag": [],
  "ransomwareGroup": "string",
  "companyName": "string",
  "companyAddress": "string",
  "companyLink": "string",
  "country": "string",
  "naics": "string",
  "industry": "string"
}
Field | Type | Description |
---|---|---|
page | object | Page information (see DocumentHit) |
tag | array | Associated tags |
ransomwareGroup | string | Name of ransomware group |
companyName | string | Victim company name |
companyAddress | string | Company address |
companyLink | string | Company website |
country | string | Country code |
naics | string | NAICS industry code |
industry | string | Industry description |
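A common use of RansomwareHit records is to count victims per group. The following is a minimal sketch under the assumption that `hits` is a list of RansomwareHit dictionaries already extracted from `data.hits`; `victims_per_group` and the `"unknown"` fallback label are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def victims_per_group(hits: Iterable[Dict[str, Any]]) -> Counter:
    """Count RansomwareHit records by their ransomwareGroup field."""
    # Records without a group name are counted under "unknown"
    return Counter(hit.get("ransomwareGroup") or "unknown" for hit in hits)
```

`Counter.most_common(n)` then gives the n most active groups in the result set.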
Stat Schema
Used for statistics endpoints.
{ "key": "string", "doc_count": 0 }
Field | Type | Description |
---|---|---|
key | string | Category identifier |
doc_count | integer | Number of occurrences |
AggStats Schema
Used for aggregated statistics with sub-categories.
{
  "key": "string",
  "doc_count": 0,
  "key_as_string": "string",
  "agg": {
    "buckets": [
      { "key": "string", "doc_count": 0 }
    ]
  }
}
Field | Type | Description |
---|---|---|
key | string | Primary category key |
doc_count | integer | Total count for category |
key_as_string | string | Human-readable key |
agg.buckets | array | Sub-category breakdowns |
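The nested `agg.buckets` structure is easier to load into a table or CSV once flattened into rows. This sketch assumes a list of AggStats dictionaries as input; `flatten_agg_stats` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List, Tuple


def flatten_agg_stats(stats: Iterable[Dict[str, Any]]) -> List[Tuple[Any, Any, int]]:
    """Turn AggStats entries into (category, sub_category, doc_count) rows."""
    rows = []
    for entry in stats:
        # "agg" may be missing or null for categories without sub-buckets
        buckets = (entry.get("agg") or {}).get("buckets") or []
        for bucket in buckets:
            rows.append((entry.get("key"), bucket.get("key"), bucket.get("doc_count", 0)))
    return rows
```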
Instant Messaging Schemas
ImMessageHit Schema
Used for Telegram and Discord message data.
{
  "userId": 0,
  "username": "string",
  "channelId": 0,
  "messageId": "string",
  "message": "string",
  "channelTitle": "string",
  "languages": [
    { "language": "string", "probability": 0.95 }
  ],
  "sha1sum": "string",
  "sha256sum": "string",
  "media": "string",
  "detectionDate": "2019-08-24T14:15:22Z",
  "serverId": "string",
  "serverTitle": "string",
  "platform": "telegram"
}
Field | Type | Description |
---|---|---|
userId | integer/string | User identifier |
username | string | Username on platform |
channelId | integer/string | Channel/group identifier |
messageId | string | Unique message ID |
message | string | Message content |
channelTitle | string | Channel/group name |
languages | array | Detected languages with confidence |
sha1sum | string | SHA-1 hash of the message content |
sha256sum | string | SHA-256 hash of the message content |
media | string | Media type if present |
detectionDate | string | ISO 8601 timestamp |
serverId | string | Discord server ID (Discord only) |
serverTitle | string | Discord server name (Discord only) |
platform | string | Platform name |
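Since `userId` and `channelId` may arrive as either integers or strings, comparisons and dictionary lookups are more reliable after normalizing them to one type. The sketch below converts IDs to strings and builds a deduplication key; treating platform, channel, and message ID together as a unique key is an assumption, and both function names are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def normalize_id(value: Any) -> Optional[str]:
    """Convert an integer-or-string identifier to a string; keep None as None."""
    return None if value is None else str(value)


def message_key(hit: Dict[str, Any]) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Build a (platform, channelId, messageId) key for deduplication (assumed unique)."""
    return (
        hit.get("platform"),
        normalize_id(hit.get("channelId")),
        normalize_id(hit.get("messageId")),
    )
```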
ImProfileHit Schema
Used for user profile data.
{
  "userId": 0,
  "usernames": ["string"],
  "firstName": ["string"],
  "lastName": ["string"],
  "detectionDate": "2019-08-24T14:15:22Z",
  "profilePhoto": ["string"],
  "platform": "telegram",
  "email": [{ "value": "string" }],
  "telegram": [{ "value": "string" }],
  "whatsapp": [{ "value": "string" }],
  "bitcoin_address": [{ "value": "string" }],
  "ethereum_address": [{ "value": "string" }],
  "monero_address": [{ "value": "string" }],
  "ripple_address": [{ "value": "string" }],
  "zcash_address": [{ "value": "string" }],
  "polkadot_address": [{ "value": "string" }]
}
Field | Type | Description |
---|---|---|
userId | integer | User identifier |
usernames | array | Known usernames |
firstName | array | Known first names |
lastName | array | Known last names |
detectionDate | string | ISO 8601 timestamp |
profilePhoto | array | Profile photo URLs |
platform | string | Platform name |
email | array | Associated email addresses |
telegram | array | Telegram handles |
whatsapp | array | WhatsApp numbers |
*_address | array | Cryptocurrency addresses |
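The `*_address` fields share the same shape, a list of objects with a `value` key, so they can be collected generically by suffix. This is a minimal sketch; `crypto_addresses` is a hypothetical helper, and it relies only on the field naming shown in the schema above.

```python
from typing import Any, Dict, List

ADDRESS_SUFFIX = "_address"


def crypto_addresses(record: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map currency name to address strings for every non-empty *_address field."""
    result: Dict[str, List[str]] = {}
    for field, entries in record.items():
        if field.endswith(ADDRESS_SUFFIX) and entries:
            currency = field[: -len(ADDRESS_SUFFIX)]
            result[currency] = [entry["value"] for entry in entries if "value" in entry]
    return result
```

The same function also works on DocumentHit records, which use identical `*_address` field names.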
ImChannelHit Schema
Used for channel/group information.
{
  "channelId": 0,
  "channelTitles": ["string"],
  "detectionDate": "2019-08-24T14:15:22Z",
  "creationDate": "2019-08-24T14:15:22Z",
  "channelPhoto": ["string"]
}
Field | Type | Description |
---|---|---|
channelId | integer | Channel identifier |
channelTitles | array | Known channel names |
detectionDate | string | When detected |
creationDate | string | Channel creation date |
channelPhoto | array | Channel photo URLs |
ImServerHit Schema
Used for Discord server information.
{
  "serverId": 0,
  "serverTitles": ["string"],
  "detectionDate": "2019-08-24T14:15:22Z",
  "creationDate": "2019-08-24T14:15:22Z",
  "serverPhoto": ["string"],
  "memberCount": 0,
  "discordLink": ["string"]
}
Field | Type | Description |
---|---|---|
serverId | integer | Server identifier |
serverTitles | array | Known server names |
detectionDate | string | When detected |
creationDate | string | Server creation date |
serverPhoto | array | Server icon URLs |
memberCount | integer | Number of members |
discordLink | array | Discord invite links |
Feed Schemas
RansomFeedHit Schema
Used in daily ransomware feeds.
{
  "id": "string",
  "ransomwareGroup": "string",
  "companyName": "string",
  "companyAddress": "string",
  "companyLink": "string",
  "country": "string",
  "naics": "string",
  "industry": "string",
  "detectionDate": "2019-08-24T14:15:22Z"
}
ImFeedHit Schema
Used in Telegram channel feeds.
{
  "id": "string",
  "telegram": ["string"],
  "detectionDate": "2019-08-24T14:15:22Z",
  "url": "string",
  "path": "string",
  "network": "string"
}
Common Data Types
Language Object
{ "language": "en", "probability": 0.95 }
Field | Type | Description |
---|---|---|
language | string | ISO 639-1 language code |
probability | number | Confidence score (0-1) |
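When a message carries several Language Objects, a client usually wants the single most probable one. The sketch below picks it and discards low-confidence guesses; the `min_probability` default of 0.5 is an arbitrary assumption, and `most_likely_language` is a hypothetical helper.

```python
from typing import Any, Dict, List, Optional


def most_likely_language(
    languages: Optional[List[Dict[str, Any]]],
    min_probability: float = 0.5,  # assumed cut-off, tune as needed
) -> Optional[str]:
    """Return the language code with the highest probability, or None."""
    if not languages:
        return None
    best = max(languages, key=lambda item: item.get("probability", 0.0))
    if best.get("probability", 0.0) < min_probability:
        return None
    return best.get("language")
```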
Address Object
{ "value": "string" }
Used for all cryptocurrency addresses and contact information.
Error Object
{ "code": 400, "message": "string" }
Field | Type | Description |
---|---|---|
code | integer | HTTP status code |
message | string | Error description |
ValidationError Object
{
  "detail": [
    { "loc": ["string"], "msg": "string", "type": "string" }
  ]
}
Field | Type | Description |
---|---|---|
detail | array | Array of validation errors |
detail[].loc | array | Field location path |
detail[].msg | string | Error message |
detail[].type | string | Error type |
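A ValidationError body is easier to log or show to a user once each entry is reduced to one line. The following sketch joins the `loc` path with dots and appends the message and type; the output format and the name `format_validation_errors` are illustrative choices, not part of the API.

```python
from typing import Any, Dict, List


def format_validation_errors(body: Dict[str, Any]) -> List[str]:
    """Render each detail entry as 'loc.path: msg (type)'."""
    lines = []
    for item in body.get("detail") or []:
        # loc entries may be strings or integers (list indexes), so cast each part
        location = ".".join(str(part) for part in item.get("loc", []))
        lines.append(f"{location}: {item.get('msg', '')} ({item.get('type', '')})")
    return lines
```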
Enumerated Values
Network Types
Value | Description |
---|---|
tor | Tor hidden service |
clearnet | Standard internet |
Language Codes
The API uses ISO 639-1 language codes. Common values include:
Code | Language |
---|---|
en | English |
es | Spanish |
fr | French |
de | German |
ru | Russian |
zh | Chinese |
ar | Arabic |
Platform Types
Value | Description |
---|---|
telegram | Telegram messaging |
discord | Discord messaging |
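The network and platform values listed above map naturally onto enumerations, which catch typos and unexpected values early. This is a sketch using only the values documented here; the class names and the `parse_network` helper are hypothetical, and returning `None` for unknown values is a design assumption so that new server-side values do not crash a client.

```python
from enum import Enum
from typing import Optional


class Network(str, Enum):
    TOR = "tor"
    CLEARNET = "clearnet"


class Platform(str, Enum):
    TELEGRAM = "telegram"
    DISCORD = "discord"


def parse_network(value: Optional[str]) -> Optional[Network]:
    """Return the matching Network member, or None for unknown values."""
    try:
        return Network(value)
    except ValueError:
        return None
```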
Schema Validation
All schemas are validated server-side. Common validation rules include:
- Required fields: Must be present in the response
- Type validation: Fields must match specified types
- Format validation: Dates must be ISO 8601, UUIDs must be valid format
- Range validation: Numeric fields may have min/max constraints
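Although validation happens server-side, a client may still want a light sanity check before storing records. The sketch below applies the required-field, type, and date-format rules to a Page Object, using the fields the Page Object table marks as required; `check_page` and `REQUIRED_PAGE_FIELDS` are hypothetical names, and the check is deliberately not exhaustive.

```python
from datetime import datetime
from typing import Any, Dict, List

# Fields marked as required in the Page Object table, with their expected types
REQUIRED_PAGE_FIELDS = {"id": str, "url": dict, "detectionDate": str}


def check_page(page: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a Page Object (empty if none)."""
    problems = []
    for field, expected in REQUIRED_PAGE_FIELDS.items():
        if field not in page:
            problems.append(f"missing required field: {field}")
        elif not isinstance(page[field], expected):
            problems.append(f"{field}: expected {expected.__name__}")
    date = page.get("detectionDate")
    if isinstance(date, str):
        try:
            # Map a trailing 'Z' to an explicit UTC offset before parsing
            datetime.fromisoformat(date.replace("Z", "+00:00"))
        except ValueError:
            problems.append("detectionDate: not ISO 8601")
    return problems
```

Collecting problems in a list, rather than raising on the first one, makes it possible to report every issue in a record at once.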
Best Practices
Handling Optional Fields
# Safe field access: fall back to a default when an optional field is missing or null
def get_document_title(document_hit):
    page = document_hit.get('page') or {}
    return page.get('pageTitle') or 'Unknown Title'
Type Checking
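# Check that required fields are present, then check the types of the ID fields
def validate_message_hit(hit):
    required_fields = ['userId', 'channelId', 'messageId', 'detectionDate']
    for field in required_fields:
        if field not in hit:
            raise ValueError(f"Missing required field: {field}")
    # userId and channelId are documented as integer/string
    for field in ('userId', 'channelId'):
        if not isinstance(hit[field], (int, str)):
            raise TypeError(f"Unexpected type for {field}: {type(hit[field]).__name__}")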
Date Parsing
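from datetime import datetime

# Parse an ISO 8601 timestamp; 'Z' is mapped to '+00:00' because
# datetime.fromisoformat() rejects a trailing 'Z' before Python 3.11
def parse_detection_date(date_string):
    return datetime.fromisoformat(date_string.replace('Z', '+00:00'))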