
Data Schemas Reference

This reference provides detailed documentation of all data schemas used in the Vysion API responses. Understanding these schemas is essential for proper data parsing and integration.

All Vysion API responses follow a consistent wrapper format:

{
"data": {
"total": 0,
"hits": []
},
"error": null
}

| Field | Type | Description |
| --- | --- | --- |
| data | object/null | Response data, or null if an error occurred |
| data.total | integer | Total number of results available |
| data.hits | array | Array of result objects |
| error | object/null | Error information, or null if successful |
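
As a minimal sketch of how a client might unwrap this envelope, the helper below returns the total and the hits, and raises when `error` is not null. The names `unwrap_response` and `VysionAPIError` are hypothetical and not part of the Vysion API; the payload is assumed to be already decoded from JSON.

```python
from typing import Any, Dict, List, Tuple


class VysionAPIError(Exception):
    """Hypothetical exception raised when the wrapper's `error` field is set."""


def unwrap_response(payload: Dict[str, Any]) -> Tuple[int, List[Dict[str, Any]]]:
    """Return (total, hits) from a wrapper-format response, or raise on error."""
    error = payload.get("error")
    if error is not None:
        # The error object is passed through untouched for the caller to inspect.
        raise VysionAPIError(error)
    data = payload.get("data") or {}
    return data.get("total", 0), data.get("hits") or []


total, hits = unwrap_response({"data": {"total": 0, "hits": []}, "error": None})
```

Treating a null `data` as an empty result keeps the caller free of `None` checks; whether that is appropriate depends on how strictly the integration should fail.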

DocumentHit: used in document search and retrieval endpoints.

{
"page": {
"id": "string",
"url": {
"url": "string",
"networkProtocol": "string",
"domainName": "string",
"port": 0,
"path": "string",
"signature": "uuid",
"network": "tor"
},
"foundAt": "string",
"pageTitle": "string",
"language": "en",
"html": "string",
"text": "string",
"sha1sum": "string",
"sha256sum": "string",
"ssdeep": "string",
"detectionDate": "2019-08-24T14:15:22Z",
"screenshot": "string",
"chunk": false
},
"tag": [
{
"namespace": "string",
"predicate": "string",
"value": "string"
}
],
"email": [],
"paste": [],
"skype": [],
"telegram": [],
"whatsapp": [],
"bitcoin_address": [
{
"value": "string"
}
],
"polkadot_address": [],
"ethereum_address": [],
"monero_address": [],
"ripple_address": [],
"zcash_address": []
}

page object fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | true | Unique document identifier |
| url | object | true | URL information object |
| foundAt | string | false | Source where document was found |
| pageTitle | string | false | Title of the webpage |
| language | string | false | Detected language (ISO 639-1) |
| html | string | false | Raw HTML content |
| text | string | false | Extracted text content |
| sha1sum | string | false | SHA1 hash of content |
| sha256sum | string | false | SHA256 hash of content |
| ssdeep | string | false | Fuzzy hash for similarity detection |
| detectionDate | string | true | ISO 8601 timestamp |
| screenshot | string | false | Screenshot URL if available |
| chunk | boolean | false | Whether this is a partial document |

url object fields:

| Field | Type | Description |
| --- | --- | --- |
| url | string | Complete URL |
| networkProtocol | string | Protocol (http, https) |
| domainName | string | Domain name |
| port | integer | Port number |
| path | string | URL path |
| signature | string | URL signature (UUID) |
| network | string | Network type (tor, clearnet) |

tag object fields:

| Field | Type | Description |
| --- | --- | --- |
| namespace | string | Tag namespace |
| predicate | string | Tag predicate |
| value | string | Tag value |
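
The nested page, url, and tag objects above can be flattened into a compact summary. The sketch below is one way to do that under the assumption that optional fields may be absent or null; `summarize_page` and the keys of its return value are illustrative names, not part of the API.

```python
def summarize_page(hit):
    """Flatten the page, url, and tag parts of a document hit into one dict."""
    page = hit.get("page") or {}
    url = page.get("url") or {}
    # Each tag is reduced to a (namespace, predicate, value) triple.
    tags = [
        (tag.get("namespace"), tag.get("predicate"), tag.get("value"))
        for tag in hit.get("tag") or []
    ]
    return {
        "id": page.get("id"),
        "url": url.get("url"),
        "is_tor": url.get("network") == "tor",
        "is_partial": bool(page.get("chunk")),
        "tags": tags,
    }
```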

Used for ransomware victim data.

{
"page": {
"id": "string",
"url": {
"url": "string",
"networkProtocol": "string",
"domainName": "string",
"port": 0,
"path": "string",
"signature": "uuid",
"network": "tor"
},
"foundAt": "string",
"pageTitle": "string",
"language": "en",
"detectionDate": "2019-08-24T14:15:22Z"
},
"tag": [],
"ransomwareGroup": "string",
"companyName": "string",
"companyAddress": "string",
"companyLink": "string",
"country": "string",
"naics": "string",
"industry": "string"
}

| Field | Type | Description |
| --- | --- | --- |
| page | object | Page information (see DocumentHit) |
| tag | array | Associated tags |
| ransomwareGroup | string | Name of ransomware group |
| companyName | string | Victim company name |
| companyAddress | string | Company address |
| companyLink | string | Company website |
| country | string | Country code |
| naics | string | NAICS industry code |
| industry | string | Industry description |
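
As a small illustration of working with these fields, the following sketch tallies victims per ransomware group from a list of hits. The function name `victims_per_group` is hypothetical, and hits without a `ransomwareGroup` value are assumed to be skippable.

```python
from collections import Counter


def victims_per_group(hits):
    """Count ransomware hits per ransomwareGroup, skipping hits without one."""
    return Counter(
        hit["ransomwareGroup"] for hit in hits if hit.get("ransomwareGroup")
    )
```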

Used for statistics endpoints.

{
"key": "string",
"doc_count": 0
}

| Field | Type | Description |
| --- | --- | --- |
| key | string | Category identifier |
| doc_count | integer | Number of occurrences |

Used for aggregated statistics with sub-categories.

{
"key": "string",
"doc_count": 0,
"key_as_string": "string",
"agg": {
"buckets": [
{
"key": "string",
"doc_count": 0
}
]
}
}

| Field | Type | Description |
| --- | --- | --- |
| key | string | Primary category key |
| doc_count | integer | Total count for category |
| key_as_string | string | Human-readable key |
| agg.buckets | array | Sub-category breakdowns |
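
A nested bucket is often easier to consume as flat rows. The sketch below, with the hypothetical name `flatten_buckets`, turns one aggregated bucket into (parent, sub-key, count) tuples; it assumes `key_as_string` should be preferred over `key` when both are present.

```python
def flatten_buckets(bucket):
    """Return (parent_key, sub_key, doc_count) rows for one aggregated bucket."""
    # Prefer the human-readable key when the API supplies one.
    parent = bucket.get("key_as_string") or bucket.get("key")
    sub_buckets = (bucket.get("agg") or {}).get("buckets") or []
    return [(parent, sub["key"], sub["doc_count"]) for sub in sub_buckets]
```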

Used for Telegram and Discord message data.

{
"userId": 0,
"username": "string",
"channelId": 0,
"messageId": "string",
"message": "string",
"channelTitle": "string",
"languages": [
{
"language": "string",
"probability": 0.95
}
],
"sha1sum": "string",
"sha256sum": "string",
"media": "string",
"detectionDate": "2019-08-24T14:15:22Z",
"serverId": "string",
"serverTitle": "string",
"platform": "telegram"
}

| Field | Type | Description |
| --- | --- | --- |
| userId | integer/string | User identifier |
| username | string | Username on platform |
| channelId | integer/string | Channel/group identifier |
| messageId | string | Unique message ID |
| message | string | Message content |
| channelTitle | string | Channel/group name |
| languages | array | Detected languages with confidence |
| sha1sum | string | SHA1 hash of message content |
| sha256sum | string | SHA256 hash of message content |
| media | string | Media type if present |
| detectionDate | string | ISO 8601 timestamp |
| serverId | string | Discord server ID (Discord only) |
| serverTitle | string | Discord server name (Discord only) |
| platform | string | Platform name |
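
Because `userId` and `channelId` may arrive as either integers or strings, and the server fields apply to Discord only, it can help to normalize a message hit before storing it. The following is a sketch under those assumptions; `normalize_message_hit` and its output keys are illustrative names only.

```python
def normalize_message_hit(hit):
    """Return a flat dict with string IDs and the most probable language."""
    languages = hit.get("languages") or []
    # Pick the entry with the highest probability, or None if none were detected.
    top = max(languages, key=lambda item: item.get("probability", 0.0), default=None)

    def as_str(value):
        return None if value is None else str(value)

    normalized = {
        "platform": hit.get("platform"),
        "user_id": as_str(hit.get("userId")),
        "channel_id": as_str(hit.get("channelId")),
        "message_id": as_str(hit.get("messageId")),
        "language": top.get("language") if top else None,
    }
    # serverId is documented as Discord only, so it is added just for that platform.
    if hit.get("platform") == "discord":
        normalized["server_id"] = as_str(hit.get("serverId"))
    return normalized
```

Converting every identifier to a string avoids mixed-type keys when hits from both platforms end up in the same store.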

Used for user profile data.

{
"userId": 0,
"usernames": ["string"],
"firstName": ["string"],
"lastName": ["string"],
"detectionDate": "2019-08-24T14:15:22Z",
"profilePhoto": ["string"],
"platform": "telegram",
"email": [{"value": "string"}],
"telegram": [{"value": "string"}],
"whatsapp": [{"value": "string"}],
"bitcoin_address": [{"value": "string"}],
"ethereum_address": [{"value": "string"}],
"monero_address": [{"value": "string"}],
"ripple_address": [{"value": "string"}],
"zcash_address": [{"value": "string"}],
"polkadot_address": [{"value": "string"}]
}

| Field | Type | Description |
| --- | --- | --- |
| userId | integer | User identifier |
| usernames | array | Known usernames |
| firstName | array | Known first names |
| lastName | array | Known last names |
| detectionDate | string | ISO 8601 timestamp |
| profilePhoto | array | Profile photo URLs |
| platform | string | Platform name |
| email | array | Associated email addresses |
| telegram | array | Telegram handles |
| whatsapp | array | WhatsApp numbers |
| *_address | array | Cryptocurrency addresses |
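
The `*_address` fields all share the same shape (an array of objects with a `value` key), so they can be collected generically rather than one by one. The sketch below assumes that naming pattern holds for every cryptocurrency field; `collect_addresses` is a hypothetical helper name.

```python
def collect_addresses(profile):
    """Map each currency name to its list of address strings."""
    suffix = "_address"
    result = {}
    for field, items in profile.items():
        if not field.endswith(suffix) or not isinstance(items, list):
            continue
        values = [
            item["value"]
            for item in items
            if isinstance(item, dict) and "value" in item
        ]
        if values:
            # "bitcoin_address" becomes "bitcoin", and so on.
            result[field[: -len(suffix)]] = values
    return result
```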

Used for channel/group information.

{
"channelId": 0,
"channelTitles": ["string"],
"detectionDate": "2019-08-24T14:15:22Z",
"creationDate": "2019-08-24T14:15:22Z",
"channelPhoto": ["string"]
}

| Field | Type | Description |
| --- | --- | --- |
| channelId | integer | Channel identifier |
| channelTitles | array | Known channel names |
| detectionDate | string | When detected |
| creationDate | string | Channel creation date |
| channelPhoto | array | Channel photo URLs |

Used for Discord server information.

{
"serverId": 0,
"serverTitles": ["string"],
"detectionDate": "2019-08-24T14:15:22Z",
"creationDate": "2019-08-24T14:15:22Z",
"serverPhoto": ["string"],
"memberCount": 0,
"discordLink": ["string"]
}

| Field | Type | Description |
| --- | --- | --- |
| serverId | integer | Server identifier |
| serverTitles | array | Known server names |
| detectionDate | string | When detected |
| creationDate | string | Server creation date |
| serverPhoto | array | Server icon URLs |
| memberCount | integer | Number of members |
| discordLink | array | Discord invite links |

Used for leaked data from Telegram channels.

{
"id": "string",
"detectionDate": "2024-01-15T10:30:00Z",
"filePath": "leaked_database.sql",
"fileHash": "a3b2c1d4e5f6...",
"fileSize": 1024000,
"fileType": "sql",
"detectedMimeType": "text/plain",
"decompressedFilename": "leaked_data.pdf",
"archiveSource": "archive.zip",
"archiveMemberPath": "leaked_data.pdf",
"detectedInfo": {
"emails": ["user@example.com"],
"usernames": ["johndoe"],
"phone_numbers": ["+1234567890"],
"ipv4_addresses": ["192.168.1.1"],
"ipv6_addresses": ["2001:0db8::1"],
"bitcoin_addresses": ["1A1zP1eP..."],
"ethereum_addresses": ["0x742d35..."],
"hashes": ["a3b2c1d4e5f6..."]
},
"telegram": {
"telegram_id": "-1002104057089_108",
"channelId": -1002104057089,
"messageId": 108,
"channelName": "Data Leaks",
"channelUsername": "dataleaks"
},
"language": "en",
"languages": [
{
"language": "en",
"probability": 0.95
}
],
"parseStatus": "success",
"downloadUrl": "string",
"highlight": {
"detectedInfo.emails": ["<mark>user@example.com</mark>"],
"content": ["snippet 1", "snippet 2"]
}
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | true | Unique leak identifier |
| detectionDate | string | true | ISO 8601 timestamp |
| filePath | string | false | Path to leaked file |
| fileHash | string | false | File hash (SHA256/SHA1/MD5) |
| fileSize | integer | false | File size in bytes |
| fileType | string | false | File extension/type |
| detectedMimeType | string | false | MIME type from file content |

Archive fields:

| Field | Type | Description |
| --- | --- | --- |
| decompressedFilename | string | Original filename if from archive |
| archiveSource | string | Parent archive filename |
| archiveMemberPath | string | Path within archive |

The detectedInfo object contains extracted entities:

| Field | Type | Description |
| --- | --- | --- |
| emails | array | Email addresses found |
| usernames | array | Usernames found |
| phone_numbers | array | Phone numbers found |
| ipv4_addresses | array | IPv4 addresses found |
| ipv6_addresses | array | IPv6 addresses found |
| bitcoin_addresses | array | Bitcoin wallet addresses |
| ethereum_addresses | array | Ethereum wallet addresses |
| monero_addresses | array | Monero wallet addresses |
| ripple_addresses | array | Ripple wallet addresses |
| zcash_addresses | array | Zcash wallet addresses |
| polkadot_addresses | array | Polkadot wallet addresses |
| binance_addresses | array | Binance Coin addresses |
| dash_addresses | array | Dash wallet addresses |
| hashes | array | File hashes found in content |

telegram object fields:

| Field | Type | Description |
| --- | --- | --- |
| telegram_id | string | Telegram message identifier |
| channelId | integer | Channel ID |
| messageId | integer | Message ID |
| channelName | string | Channel name |
| channelUsername | string | Channel username |

Conditional fields:

| Field | Type | When Present | Description |
| --- | --- | --- | --- |
| downloadUrl | string | /leak/{id} only | Presigned S3 download URL |
| highlight | object | /leak/search only | Highlighted search matches |
| language | string | When detected | Primary language code |
| languages | array | When detected | Languages with confidence |
| parseStatus | string | Always | Parsing status |
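
Since `downloadUrl` and `highlight` appear only on specific endpoints, client code should not index them directly. The sketch below reads them defensively and counts the entities found in `detectedInfo`; the function name `describe_leak` and its output keys are assumptions made for illustration.

```python
def describe_leak(leak):
    """Summarize a leak hit, tolerating fields that are only sometimes present."""
    info = leak.get("detectedInfo") or {}
    # Keep only entity types that actually have at least one value.
    entity_counts = {
        name: len(values)
        for name, values in info.items()
        if isinstance(values, list) and values
    }
    return {
        "id": leak["id"],                          # documented as required
        "parse_status": leak.get("parseStatus"),
        "download_url": leak.get("downloadUrl"),   # /leak/{id} only
        "highlight": leak.get("highlight") or {},  # /leak/search only
        "entity_counts": entity_counts,
    }
```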

Used in daily ransomware feeds.

{
"id": "string",
"ransomwareGroup": "string",
"companyName": "string",
"companyAddress": "string",
"companyLink": "string",
"country": "string",
"naics": "string",
"industry": "string",
"detectionDate": "2019-08-24T14:15:22Z"
}

Used in Telegram channel feeds.

{
"id": "string",
"telegram": ["string"],
"detectionDate": "2019-08-24T14:15:22Z",
"url": "string",
"path": "string",
"network": "string"
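}

Language object, as used in languages arrays:

{
"language": "en",
"probability": 0.95
}

| Field | Type | Description |
| --- | --- | --- |
| language | string | ISO 639-1 language code |
| probability | number | Confidence score (0-1) |

Value object:

{
"value": "string"
}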

Used for all cryptocurrency addresses and contact information.

{
"code": 400,
"message": "string"
}

| Field | Type | Description |
| --- | --- | --- |
| code | integer | HTTP status code |
| message | string | Error description |

Validation error response:

{
"detail": [
{
"loc": ["string"],
"msg": "string",
"type": "string"
}
]
}

| Field | Type | Description |
| --- | --- | --- |
| detail | array | Array of validation errors |
| detail[].loc | array | Field location path |
| detail[].msg | string | Error message |
| detail[].type | string | Error type |
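
A validation error body can be turned into readable one-line messages by joining each `loc` path. This is a minimal sketch assuming the `detail` structure documented above; `format_validation_errors` is a hypothetical helper, and the dotted path format is a display choice rather than anything the API defines.

```python
def format_validation_errors(body):
    """Return one 'location: message (type)' line per validation error."""
    lines = []
    for err in body.get("detail") or []:
        # loc entries may be strings or integers (for array indexes), so coerce them.
        location = ".".join(str(part) for part in err.get("loc") or [])
        lines.append(f"{location}: {err.get('msg')} ({err.get('type')})")
    return lines
```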

Values of the network field:

| Value | Description |
| --- | --- |
| tor | Tor hidden service |
| clearnet | Standard internet |

The API uses ISO 639-1 language codes. Common values include:

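| Code | Language |
| --- | --- |
| en | English |
| es | Spanish |
| fr | French |
| de | German |
| ru | Russian |
| zh | Chinese |
| ar | Arabic |

Values of the platform field:

| Value | Description |
| --- | --- |
| telegram | Telegram messaging |
| discord | Discord messaging |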

All schemas are validated server-side. Common validation rules include:

  • Required fields: Must be present in the response
  • Type validation: Fields must match specified types
  • Format validation: Dates must be ISO 8601, UUIDs must be valid format
  • Range validation: Numeric fields may have min/max constraints

from datetime import datetime

# Safe field access: tolerate a missing or null page or pageTitle
def get_document_title(document_hit):
    page = document_hit.get('page') or {}
    return page.get('pageTitle') or 'Unknown Title'

# Validate that required fields are present
def validate_message_hit(hit):
    required_fields = ['userId', 'channelId', 'messageId', 'detectionDate']
    for field in required_fields:
        if field not in hit:
            raise ValueError(f"Missing required field: {field}")

# Parse ISO 8601 timestamps; the 'Z' suffix is rewritten because
# datetime.fromisoformat() only accepts it from Python 3.11 onward
def parse_detection_date(date_string):
    return datetime.fromisoformat(date_string.replace('Z', '+00:00'))