четвер, 3 січня 2019 р.

CACHE4.DB FILE OF TELEGRAM FOR ANDROID (PART 3)


Maksym Boiko, mboiko25@gmail.com, Kyiv, 2018

CACHE4.DB FILE OF TELEGRAM FOR ANDROID (PART 1)

4. SEARCHING FOR INFORMATION IN THE CACHE4.DB DATABASE

 4.1. METHOD 1. SEARCHING FOR INFORMATION BY STRUCTURE OF DATABASE TABLES


This method is based on using standard queries for SQLite databases.
Step 1: Obtaining data from the messages table.
Step 2. Separating message-related information from the data field of the "messages" table, taking into account the structure described above.
Step 3. Obtaining the data from the userschatsenc_chats tables depending on the values ​​of the miduid fields from the messages” table.
Step 4. Defining user contacts from the user_contacts_v7user_phones_v7 tables (as an option, it is possible to find the data of individual interlocutors from the data field of the users table).

4.1.1. CONVERSION OF THE "UID" FIELD VALUE


mid value
uid value
Type
Conversion
mid > 0 and not a large digit
uid > 0 and its length is 9 characters or less
chat
uid without conversions
uid < 0 and its length is 9 characters or less
group
uid without conversions
mid > 0 and a large digit (16-digithexadecimal number)
uid < 0 and its length is10 characters
channel
uid without conversions
mid < 0
uid – 16-digit hexadecimal number
secret chat
uid with conversions
uid > 0 and its length is 9 characters or less
chat
uid without conversions
uid < 0 and its length is10 characters
channel
uid without conversions
uid < 0 and its length is 9 characters or less
group
uid without conversions

 Possible values ​​of the mid and uid fields:
- if mid and uid are positive  it is a regular chat; the interlocutor data can be find in the Users table by uid (usually the uid length is 9 characters or less)[3];
- if mid is a small positive digit[4] and uid is negative  it is a group, the data on this group can be find in the Chats table by uid (usually the uid length is 9 characters or less);
- if mid is a large positive digit[5] and uid is negative  it is a channel; the data on this channel can be find in the Chats table by uid (usually the uid length is 10 characters);
- if mid is negative and uid is 9-character length or less – the interlocutor’s data can be find in the Users table by uid;
- if mid is negative, uid is negative or positive and has a length of about 19 characters[6] – the interlocutor’s data can be find in the Enc_chats table by the converted uid” value. To find the appropriate contact in the Enc_chats table, you need to convert the decimal “uid” number to hexadecimal, then cut off the last 8 zeros from the received number and convert the eight-digit hexadecimal number back to decimal;
- if mid and uid are negative  it is a channel or group actions with joining/adding/removing another users; the data on this channel/group can be find in the Chats table by uid”.

4.1.2. TABLES RELATIONSHIPS


Relationships between cache4.db database tables in cases of chat between two users, a secret chat, a group and a channel are shown in the following figures 14, 15, 16, 17.

 Figure 14 Searching for information on an interlocutor in case of chat between two users or calls

Figure 15 Searching for information on an interlocutor in case of a secret chat

Figure 16 Searching for information on an interlocutor in case of group conversation

Figure 17 Searching for information on an interlocutor in case of a channel messages


4.2. METHOD 2. SEARCHING FOR INFORMATION BY INTERNAL STRUCTURE OF DATABASE FIELDS


The method is based on searching for values ​​of the data fields of the messagesuserschats, and enc_chats tables in the “cache4.db” file[7].
Note: In addition to the SQLite database file (cache4.db”), special interest should be given to the cache4.db-wal log file, which may contain data (about messages, contacts, etc.) either deleted from the main database "cache4.db " or has not yet been added to it. Due to the fact that the cache4.db-wal rollback file contains complete SQLite records from the main database file ("cache4.db"), this method is suitable for searching for message-related data in the "cache4.db-wal file.
Figure 18 shows an example of the cache4.db-wal rollback file with the SQLite database's messages entry, which has not yet been added in the cache4.db” file (the message body is in a rectangle, the signature of the data field is underlined).



Figure 18 Part of the cache4.db-wal file

Step 1: Searching by signatures for information on the “data fields of the “chatsenc_chatsusers tables, as well as their initial analysis (i.e., searching for data on chats, groups, channels, and users).
Step 2. Searching by signatures for information on the “data fields of the “messages table (three possible options), as well as their initial analysis (i.e. searching for message bodies and interpretation of another data).
Step 3. Analysis of the “data fields from the “messages" table  categorization of identified data (chat between two users, secret chat, group, channel, group creation, channel creation, call).
Step 4. Analysis of the “data fields from the “chatsenc_chatsusers tables  searching for matches between users id, chats id, their names and phone numbers.
Step 5. Searching for information on interlocutors by their unique identifiers among the data fields of the chatsenc_chatsusers” tables (depending on a category of the “data” field of the "messages" table).

DETAILED ALGORITHM:
Step 1: Searching by signatures for information on the data fields of the “chatsenc_chatsusers tables, as well as their initial analysis (i.e., searching for data on chats, groups, channels and users):
a) "c3 f4 13 2e" – initial 4 bytes of the "data" field with user details from the “users table (including the Telegram user);
b) "36 ce 56 fa" – initial 4 bytes of the "data" field with secret chat details from the "enc_chats" table;
c) "ac 74 89 c8" – initial 4 bytes of the "data" field with channel details;
d) "54 dd 1c d9", "db 8b 32 07", "54 dd 1c d9", "1c 4b b4 0c", "15 71 0b 45"[8] - initial 4 bytes of the "data" field with group details.

Interpretation of the “data fields of the chatsenc_chatsusers tables is given below in tabular form. Examples of extractions of information on users, groups, channels are shown in following figures 19, 20, 21, 22.

1)      User details from the “data field of the “users table (see the figure 19):

#
Offset
Lengthbytes
Description
1
0
4
= «cf4 13 2e»
2
8
4
User id (uid)
3
20
1
Size of user name
4
21
variable
User name (it can consist of some parts. For instance, ithe name is followed by the value 00, the next non-zero byte indicates the size of the user’s phone data. If the name is followed by non-zero character, then this   byte indicates the size of a next part of username)
5
After zero bytes
1
Size of user’s phone number
6
Next bytes
variable
User’s phone number

Figure 19 The “data field of the “users table for a chat between two users

2) User details from the “data field of the “enc_chats table (see the figure 20):

#
Offset
Lengthbytes
Description
1
0
4
= «36 ce 56 fa»
2
4
4
Chat id (uid)
3
24
4
User id (user)


Figure 20 The “data field of the “enc_chats table
  

3) User details from the “data field of the “chats table in case of a channel (see the figure 21):

#
Offset
Lengthbytes
Description
1
0
4
= «ac 74 89 c8»
2
8
4
Channel id (uid)
3
20
1
Size of channel’s name
4
21
variable
Channel’s name

Figure 21 The “data field of the “chats table in case of a channel

4) User details from the “data field of the “chats table in case of a group (see the figure 22):

#
Offset
Lengthbytes
Description
1
0
4
= «54 dd 1c d9», «db 8b 32 07»,
«1c 4b b4 0c», «15 71 0b 45»[9]
2
8
4
Group id (uid)
3
9
1
Size of group’s name
4
10
variable
Group’s name
 
Figure 22 The “data field of the “chats table in case of a group

Step 2. Searching by signatures for information on the data fields of the “messages table (three possible options), as well as their initial analysis:
a) «3d b4 f9 44» [10] - initial 4 bytes of a field that contains data on a regular chat (chat between two users, channel or group):
- value with offset 12 is equal to 32 e5 dd bd - data on messages in a channel;
- value with offset 16 is equal to bb e5 d0 ba - data on messages in a group;
- value with offset 16 is equal to 6d bc b1 9d - data on a chat between two users;
b) «f6 a1 19 9e» – initial 4 bytes of a field with either call or group’s/channel’s creation data;
- value with offset 12 or 16 is equal to 32 e5 dd bd – channel’s creation data;
- value with offset 16 is equal to bb e5 d0 ba – group’s creation data;
- value with offset 16 is equal to 6d bc b1 9d - calling data;
c) «fa 55 55 55» [11] - initial 4 bytes of a field with data on a secret chat (additionally  value with offset 20 is equal to 6d bc b1 9d).

Note on file transfer[12]:
In case of file sharing between users, the “data field of the “messages table contains information on file names, their sizes, etc. (after information described in the tables below).
So if you send a photo from the user's gallery, the first significant bytes is “d7 50 51 69” after message body (if any). There is data on file namessizes of the sent file and its cached images stored in the \storage\emulated\0\Android\data\org.telegram.messenger\cache[13] folder. This information is after a value “76 90 d6 53. Also the file path is incremented by one at the end of the data field.
In case of receiving a picture, a structure of data obtained on a received file and its cached images is similar in general. There may be several file markers (76 90 d6 53) which containing information about file namessizes of the file and its cached images stored in the “\media\0\telegram\telegram images\storage\emulated\0\Android\data\org.telegram.messenger\ cache, etc.
In case of sending a file from an explorer, the first significant bytes after the message metadata are d7 70 b0 9c. Then you can find information on file name and its size (in case of loading the last one) in the data field.

Interpretation of the “data fields of the "messages" table in cases of conversation between two users in regular chat, secret chat, channel, group, channels/groups creation, audio calls is given below in tabular form. 

1) Message details in case of chat between two users (from the data field of the messages table):

#
Offset
Lengthbytes
Description
1
0
4
= «3d bf9 44»
2
8
4
Message id (mid)
3
12
4
id (uid) of userthat sent a message
4
16
4
=«6d bc b1 9d»
5
20
4
interlocutor id
6
24
4
Unix Epoch timestamp of message
7
28
1
Size of a message (if value is not equal «fe»)
29
3
Size of a message (this fields are used if only value at offset 28 bytes is equal «fe»)
8
29 or 32
variable
Message body

2) Message details in case of a secret chat (from the data field of the messages table):

#
Offset
Lengthbytes
Description
1
0
4
= «fa 55 55 55»
2
8
4
Message id (mid)
3
16
4
id (uid) of userthat sent a message
4
20
4
=«6d bc b1 9d»
5
24
4
interlocutor id
6
28
4
Unix Epoch timestamp of message
7
32
1
Size of a message (if a value is not equal «fe»)
33
3
Size of a message (this fields are used if only a value at offset 32 bytes is equal «fe»)
8
33 or 36
variable
Message body

3) Message details in case of a channel (from the data field of the messages table)(items 6, 7 are taken into account in event of messages in a channel, items 8, 9 - in event of channel’s creation):

#
Offset
Lengthbytes
Description
1
0
4
= «3d bf9 44» – message
= «f6 a1 19 9e» – channel’s creation
2
8
4
Message id (mid)
3
12
4
= «32 e5 dd bd»
4
16
4
Channel id
5
20
4
Unix Epoch timestamp of message
6
24
1
Size of a message (if a value is not equal «fe»)
25
3
Size of a message (this fields are used if only a value at offset 24 bytes is equal «fe»)
7
25 or 28
variable
Message body
8
28
1
Size of channel’s name
9
29
variable
Channel’s name

4) Channel’s/group’s details in case of its creation by another user and subsequent joining/adding/removing of persons (from the data field of the messages table):

#
Offset
Lengthbytes
Description
1
0
4
= «f6 a1 19 9e» – channel/group-related action
2
8
4
Message id (mid)
3
12
4
Channel’s/group’s creator/administrator id
4
16
4
= «32 e5 dd bd»
5
20
4
Channel/group id
6
24
4
Unix Epoch timestamp of action
7
40
4
Id of joined/added/removed user

5) Message details in case of a group (from the data field of the messages table) (items 7, 8 are taken into account in event of messages in a group, items 9, 10 - in event of group’s creation):

#
Offset
Lengthbytes
Description
1
0
4
= «3d bf9 44» – message
= «f6 a1 19 9e» – group’s creation
2
8
4
Message id (mid)
3
12
4
id (uid) of userthat sent a message
4
16
4
bb e5 d0 ba»
5
20
4
Group id
6
24
4
Unix Epoch timestamp of message
7
28
1
Size of a message (if a value is not equal «fe»)
29
3
Size of a message (this fields are used if only a value at offset 28 bytes is equal «fe»)
8
29 or 32
variable
Message body
9
32
1
Size of group’s name
10
33
variable
Group’s name

6) Calling details (from the data field of the messages table):

#
Offset
Lengthbytes
Description
1
0
4
= «f6 a1 19 9e»
2
8
4
Message id (mid)
3
12
4
id (uid) of userthat made a call
4
16
4
=«6d bc b1 9d»
5
20
4
interlocutor id
6
24
4
Unix Epoch timestamp of calling
7
32
1
If a value is equal «0x03» – a connection is successful
If a value is equal «0x01» – a connection is not successful
8
48
4
Call duration (in case of successful connection only)

Step 3. Analysis of the data fields of the messages table  categorization of identified data (chat between two users, secret chat, group, channel, group’s/channel’s creation, call).

Interpretation of information from the “data fields of the "messages" table in cases of conversation between two users in regular chat, secret chat, channel, group, channels/groups creation, audio callsfile transfers is given below in tabular form. Examples of extractions of mentioned information are shown in following figures 23, 24, 25, 26, 27, 28, 29, 30.

1) Message details in case of a chat between two users (from the data field of the messages table) (see the figure 23):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «3d bf9 44»
2
8
4
0x00000003
3
3
12
4
0x26a3a99a
648259994
4
16
4
=«6d bc b1 9d»
5
20
4
0x26a3a99a
648259994
6
24
4
0x5bb1e543
1538385219 – 01.10.2018 12:13:39 UTC+3
7
28
1
0x21
33
8
29
33
I am fine. Is this a secret chat?
 
Figure 23 Message details in case of a chat between two users (from the data field of the messages table)

2) Message details in case of a secret chat (from the data field of the messages table) (see the figure 24):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «fa 55 55 55»
2
8
4
0xfffccba7
-210009 (0xfffffffffffccba7)
3
16
4
0x269dc727
647874343
4
20
4
=«6d bc b1 9d»
5
24
4
0x26a3a99a
648259994
6
28
4
0x5bb1e789
1538385801 – 01.10.2018 12:23:21 UTC+3
7
32
1
0x0c
12
8
33
12
Where is it?

Figure 24 Message details in case of a secret chat (from the data field of the messages table)

3) Message details in case of a channel (from the data field of the messages table) (see the figure 25):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «3d bf9 44»
2
8
4
0x00000003
5876046144655589379 (0x518be8d600000003)
3
12
4
= «32 e5 dd bd»
4
16
4
0x518be8d6
1368123606
5
20
4
0x5bb1e9ad
1538386349  – 01.10.2018 12:32:29 UTC+3
6
24
1
0x26
38
7
25
38
It is a channel about small town Sumy.
 
Figure 25 Message details in case of a channel (from the data field of the messages table)

4) Message details in case of a group (from the data field of the messages table) (see the figure 26):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «3d bf9 44»
2
8
4
0x00000016
22
3
12
4
0x09e5c49e
166053022
4
16
4
bb e5 d0 ba»
5
20
4
0x12f7659c
318203292
6
24
4
0x5bb1ead9
1538386649 – 01.10.2018 12:37:29 UTC+3
7
28
1
0x08
8
8
29
8
Ok. Nice

 Figure 26 Message details in case of a group (from the data field of the messages table)

5) Message details in case of a call (from the data field of the messages table) (see the figure 27):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «f6 a1 19 9e»
2
8
4
0x0000000d
13
3
12
4
0x269dc727
647874343
4
16
4
=«6d bc b1 9d»
5
20
4
0x26a3a99a
648259994
6
24
4
0x5bb1e703
1538385667 – 01.10.2018 12:21:07 UTC+3
7
32
1
0x03
Successful connection
8
48
4
0x00000011
Call duration – 17 seconds
 
Figure 27 Message details in case of a call (from the data field of the messages table)

6) File details in case of sending a photo from the user's gallery (from the data field of the messages table) (see the figure 28):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «fa 55 55 55»
Signature of a secret chat
2
44
4
= «d7 50 51 69»
3
96
4
= «76 90 d6 53»
 Beginning of data about file
4
104
4
0x80000000
The 1st part of file name –  2147483648
5
112
4
0xfffccbad
 The 2nd part of file name –              -210003 (=0xfffffffffffccbad)
6
132
4
0x000005bc
 File size – 1468 bytes
7
152
8
0x49919d6e000002b1
 The 1st part of file name – 5301191332163748529
8
160
4
0x22396b92
 The 2nd part of file name – 574188434
9
252
4
0x0001f89e
File size – 129182 bytes
10
272
1
0x55
Length of file path – 85 bytes
11
273
85
File path (file name is incremented by one)

Figure 28 File details in case of sending a photo from the user's gallery (from the data field of the messages table)

7) File details in case of receiving a photo (from the data field of the messages table) (see the figure 29):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «3d b4 f9 44»
Signature of a chat
2
32
4
= «d7 50 51 69»
3
228
4
= «76 90 d6 53»
Beginning of data on file
4
236
4
0x0eb7c820
The 1st part of file name – 246925344
5
244
4
0x00043f0a
 The 2nd part of file name – 278282
6
264
4
0x0003b2a5
File size – 242341 bytes

Figure 29 File details in case of receiving a photo (from the data field of the messages table)

8) Details on received file (in this case, a file is not loaded) (from the data field of the messages table) (see the figure 30):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «fa 55 55 55»
Signature of a secret chat
2
36
4
= «d7 70 b0 9c»
3
116
16
File name – «f454477n303.doc»


Figure 30 Details on received file (from the data field of the messages table).

Step 4: Interpretation of information from the “data fields of the chatsenc_chats and users tables.

Interpretation of information from “data fields of the chatsenc_chats and users” tables is given below in tabular form.  Examples of extractions of information on users, groups, channels are shown in following figures 31, 32, 33, 34.

1) User details from the data field of the users table (see the figure 31):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «c3 f4 13 2e»
2
8
4
0x26a3a99a
648259994
3
20
1
0x08
 8
4
21
8
Grygir27
5
32
1
0x0c
12
6
33
12
3806********
 
Figure 31 User details from the data field of the users table.

2) User details from the data field of the enc_chats table (see the figure 32):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «36 ce 56 fa»
2
4
4
0x6b0ed8be
1796135102
3
24
4
0x26a3a99a
648259994

Figure 32 User details from the data field of the enc_chats table

3) User details from the data field of the chats table in case of a channel (see the figure 33):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
= «ac 74 89 c8»
2
8
4
0x518be8d6
1368123606
3
20
1
0x08
8
4
21
8
Sumy1665

Figure 33 User details from the data field of the chats table in case of a channel

4) User details from the data field of the chats table in case of a group (see the figure 34):

#
Offset
Lengthbytes
Value
Meaning
1
0
4
«54 dd 1c d9», «db 8b 32 07»,
«1c 4b b4 0c», «15 71 0b 45»[14]
2
8
4
0x12f7659c
318203292
3
12
1
0x0d
13
4
13
13
Volleyball234

Figure 34 User details from the data field of the "chats" table in case of a group

Step 5. Searching for information on interlocutors by their unique identifiers among the data fields of the chatsenc_chatsusers tables (depending on the data field of the messages table). At the same time data on names of groups/channels can be found in the “data” field of “messages” table for cases of group’s/channel’s creation.