Helpful MySQL Queries

From GWAVA Technologies Training
Revision as of 14:45, 6 April 2017 by Stephanf (Talk | contribs)

Jump to: navigation, search

Daniel has created a number of helpful queries:

All Queries are for Retain 3.x and newer. MySQL syntax. In most instances MSSQL will work fine with the same query.

Warning: Some queries can take a heavy performance toll on the SQL server, I’ve tried to create them to be as efficient as possible but some things just take lots of processing/disk I/O time. Don’t expect these to always come back right away. ALSO, do not run a query unless you understand what you’re doing. If a query goes cartesian on you it will bring the SQL server to it’s knees and will probably ruin anything else that’s trying to access that server.

Additional note: Because google thinks they’re awesome copy and paste straight from this doc may not work. Copy the query to a notepad, or notepad++ and change the ‘text’ to actual ‘ (AKA “tick” in some circles) marks. Quotes also work.

User Information:

Contents

To get User (displaying email) with Last date they were cached, Most recently archived, and the Item Store flag:

SELECT abook_id, f_email, from_unixtime(ts_cache) AS 'Last Cached From Mail Server', from_unixtime(ts_store) AS 'Most recently Accessed', from_unixtime(ts_item) AS 'Item Store Flag'
FROM t_abook
GROUP BY abook_id;

SELECT f_first AS 'First Name', f_last AS 'Last Name', f_mailbox AS 'Mailbox ID', f_email AS 'Email Address'
    FROM t_abook
    INNER JOIN t_uuid_mapping ON t_abook.f_uid = t_uuid_mapping.f_uuid;
  
SELECT COUNT(*)
    FROM t_abook
    INNER JOIN t_uuid_mapping ON t_abook.f_uid = t_uuid_mapping.f_uuid;

Get A list of All users with a count of how many messages each person has:

SELECT abook_id, f_email AS 'Email Address', COUNT(DISTINCT(tm.message_id)) AS 'Total Message Count'
FROM t_message tm
INNER JOIN 
t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
INNER JOIN 
t_abook ta ON ta.f_uid = tum.f_uuid
GROUP BY abook_id;
  • Note: The above query took 20ish minutes to run on a system with 8,000 users and 290 million messages. Number of users that had mail archived was actually nearly 20,000

Another thing of note is my use of abook_id in the SELECT and also the WHERE clause. I’ve found that not everyone has an email address. Sometimes Retain will archive something that doesn’t have an email attached to it. Hence “WHERE abook_id =” and “GROUP BY abook_id”

Total number of messages a particular user has:

SELECT COUNT(*)
FROM t_message
INNER JOIN t_uuid_mapping ON t_message.uuid_mapping_id = t_uuid_mapping.uid_mapping_id
INNER JOIN t_abook ON t_abook.f_uid = t_uuid_mapping.f_uuid
WHERE f_email = ‘user@place.net’;

IMPORTANT: It has been found that in Retain 4.0.3 that there are sometimes situations where there is data in the t_message table, but not in the t_message_attachments table. If you have nothing in the t_message_attachments table Retain will not be able to retrieve anything from disc.

A more accurate query for the number of messages Retain actually has would probably be:

SELECT COUNT(*)
FROM t_message
INNER JOIN t_uuid_mapping ON t_message.uuid_mapping_id = t_uuid_mapping.uid_mapping_id
INNER JOIN t_abook ON t_abook.f_uid = t_uuid_mapping.f_uuid
WHERE abook_id = {number};


Number of users with data:

Select count(*) AS 'Number of users with data'
From
(
SELECT
    COUNT(*)
  FROM t_message tm
    INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
    INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
    INNER JOIN t_domains td ON ta.domain_id = td.domain_id
  GROUP BY abook_id
)
AS t
;

Number of Users in Exchange/O365 with data (Works up until 4.1.0.1, 4.2 will change things):

Select count(*) AS 'Number of Exchange/O365 Users with archive data'
From
(
SELECT
    COUNT(*)
  FROM t_message tm
    INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
    INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
    INNER JOIN t_domains td ON ta.domain_id = td.domain_id
  WHERE td.f_type = 1
  GROUP BY abook_id
)
AS t
;


Get total (possible) size of message store used by a user.

Size is in Bytes (This size may not be entirely accurate, this is a number that has been passed by the Email server to Retain):

SELECT SUM(f_size) AS 'Approximate Storage Use'
FROM t_message_attachments tma
INNER JOIN t_message tm ON tma.message_id = tm.message_id
INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
WHERE f_email = 'admin@knobias.com';


  • NOTE: The previous 2 queries could possibly be inaccurate if 2 users have the same Email address. The most accurate would probably be to use abook_id

In the WHERE clause you can use these fields that should be unique:

abook_id =
f_uid =
f_mailbox =
althash =

Most Unique Example:

SELECT abook_id AS 'Retain Address ID', f_email AS 'Email Address', SUM(f_size) AS 'Approximate Storage Use'
FROM t_message_attachments tma
INNER JOIN t_message tm ON tma.message_id = tm.message_id
INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
WHERE abook_id = {abook_id number};

User ID (abook_id), mailbox name, email address and most recently stored message date:

(Thanks Daron)

SELECT abook_id AS 'User ID', ta.f_mailbox Mailbox, ta.f_email AS 'Eamil address', from_unixtime(max(tm.f_stored)) AS 'Most recently archived message'
  from t_message tm
  inner join t_abook ta on tm.uuid_mapping_id = ta.f_uid 
  GROUP BY abook_id;


Message counts and message store size:

Get A list of All users with a count of how many messages each person has along with the possible size that their archive would take on disk. This last number will not add up to the total disk use that Retain is using. Use it as a ballpark figure.

WARNING: The following query is VERY intense. On a system with 3,500 users with mail in Retain, and 41 million total messages, it took 1 hour 17 minutes to return. Use with caution.

SELECT 
f_email AS 'Email Address', 
FORMAT(COUNT(DISTINCT(tm.message_id)),0) AS 'Number of Messages', FORMAT(SUM(f_size),0) AS 'Approximate Storage Use (in Bytes)'
FROM t_message_attachments tma
INNER JOIN t_message tm ON tma.message_id = tm.message_id
INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
GROUP BY abook_id
ORDER BY tm.message_id;

By default it’ll organize the results by the email address. If you’d like to organize it by size of the mailbox, smallest first, change the ORDER BY clause to read as follows:

ORDER BY SUM(f_size);

If you’d like it in reverse order tack on a “DESC” like so:

ORDER BY SUM(f_size) DESC;

If you want to organize it by number of messages, fewest first, tack on:

COUNT(DISTINCT(tm.message_id));

If you want to run a less intensive version: find specific users that the customer is interested in. Get the address book id and nail it down to that. On the system mentioned above the user with the most messages took only 35 seconds to complete.

Example:

SELECT f_email AS 'Email Address', FORMAT(COUNT(DISTINCT(tm.message_id)),0) AS 'Number of Messages', FORMAT(SUM(f_size),0) AS 'Approximate Storage Use (in Bytes)'
FROM t_message_attachments tma
INNER JOIN t_message tm ON tma.message_id = tm.message_id
INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
WHERE abook_id = 1585;

Other Variations

SELECT 
CONCAT(ta.f_first, ' ', ta.f_last, '(',f_mailbox, ')') AS 'Mailbox Name',
f_email AS 'Email Address', 
FORMAT(COUNT(DISTINCT(tm.message_id)),0) AS 'Number of Messages',
FORMAT(SUM(f_size),0) AS 'Approximate Storage Use (in Bytes)',
FROM_UNIXTIME(MIN(tm.f_delivered)) AS 'Oldest Message',
FROM_UNIXTIME(MAX(tm.f_delivered)) AS 'Newest Message'
FROM t_message tm
INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
GROUP BY abook_id
ORDER BY tm.message_id;

Example Return


Find User name, and most recently archived message

Below is a way to see everyone that’s older than 3 months. Just switch the > sign to a < sign to do 3 months and newer.

If you don’t care about if it’s older or younger than 3 months (for licensing) then just take out the “HAVING” line

SELECT
  CONCAT(ta.f_first, ' ', ta.f_last, '(',f_mailbox, ')') AS 'Mailbox Name',
  f_email AS 'Email Address',
  FROM_UNIXTIME(MAX(tm.f_delivered)) AS 'Newest Message'
  FROM t_message_attachments tma
  INNER JOIN t_message tm ON tma.message_id = tm.message_id
  INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
  INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
  GROUP BY abook_id
  HAVING MAX(tm.f_delivered) < UNIX_TIMESTAMP(DATE_ADD(CURDATE(), INTERVAL -3 MONTH))
  ORDER BY MAX(tm.f_delivered)
;

Example Return


Get email, number of messages that user has, Post office or server subsection they’re in, and the date of the most recent message that was archived

SELECT f_email AS 'E-mail', COUNT(*) AS 'Number of Messages in Retain', f_name AS 'Post Office', MAX(FROM_UNIXTIME(f_stored)) AS 'Most Recent Message Archived'
FROM t_message tm
INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
INNER JOIN t_postoffices tp ON tp.postoffice_id = ta.postoffice_id
GROUP BY abook_id;

WARNING: This query is VERY intense, do not run this without the understanding that this will take a significant amount of time and resources on SQL. Small test system with 96 users and 4,553,000 items in t_message took 1 minute 7 seconds to run. By way of comparison: A straight count of t_message took only 4 seconds.

Get all message data for one user for messages delivered to the mailbox before X date:

SELECT *
  FROM t_message tm
  INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
  INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
  WHERE abook_id = 1
  AND tm.f_delivered < UNIX_TIMESTAMP(DATE_ADD(CURDATE(), INTERVAL -20 YEAR))
;

Getting Message data

SELECT f_name AS 'Attachment Name', hash AS "Hash ID" 
FROM t_document td
INNER JOIN t_message_attachments tma ON td.document_id = tma.document_id 
WHERE message_id = 614645;

The above is an expanded version of the query in this KB: http://support2.gwava.com/kb/?View=entry&EntryID=2420

Besides the HASH ID it will also give you the name of the attachment.

Going in reverse order, if you have the hash ID then you can find the message ID(s) that it connects to, I also threw in the date that it was initially stored on disc (Human readable according to the date of the server you’re running this on.):

SELECT message_id, FROM_UNIXTIME(date)
FROM t_message_attachments tma
INNER JOIN t_document td ON td.document_id = tma.document_id
WHERE hash = '{Your_hash}';

Example:

SELECT message_id AS 'Message ID', FROM_UNIXTIME(date) AS 'Date Stored'
FROM t_message_attachments tma
INNER JOIN t_document td ON td.document_id = tma.document_id
WHERE hash = '3249153B5525F19A46D28D29A4BE381CD2E8E22B8510C970328CF54464FAFACC';

To find if a particular hash was indexed and the message id:

SELECT 
  tm.message_id, f_indexed
FROM
  t_message tm
inner join 
  t_message_attachments tma
ON
  tm.message_id = tma.message_id
INNER JOIN
  t_document td
ON
  tma.document_id = td.document_id
WHERE
  hash = '{hash_number}';

Check to see if you have any accidentally deleted documents (might take a while to run)

This can be a result of a problem with the database and the count will come up how many deleted documents you have. You’ll have to go through a clean up process to fix this.:

SELECT COUNT(*)
  FROM t_message_attachments tma
  LEFT JOIN t_document td ON td.document_id = tma.document_id
  WHERE
  td.document_id IS NULL
;

XML MERGE FILE CREATION:

Possible way to get the GUID’s out of a database so that you can merge them through the UI:

SELECT
  CONCAT('<MergeTask><sourceMailbox>',
  gw.f_uid,
  '</sourceMailbox><destinationMailbox>', 
  ex.f_uid,
  '</destinationMailbox><removeOrphanedEntity>true</removeOrphanedEntity></MergeTask>'
  )
FROM
  t_abook gw,
  t_abook ex
WHERE 
  gw.domain_id = 2
  and ex.domain_id = 3
  and gw.f_email = ex.f_email
  ;

Note: you’ll probably need to figure out which domain is the old and which is the new so that you can appropriately assign the combination.

Copy and paste your results into an .xml file then tack on

<MergeTasks>
    <tasks>

To the top of the file and

    </tasks>
</MergeTasks>

To the bottom of the file

Instead of looking up the unix timestamps for colums like "f_delievered" etc you can use UNIX_TIMESTAMP('20100101') I.E. UNIX_TIMESTAMP('YYYYMMDD')

So this query works in MySQL:

SELECT COUNT(*) FROM t_message 
WHERE f_delivered < 
UNIX_TIMESTAMP('2009-10-24')
;

(You can include or exclude hyphens)

If you want a specific hour/min/sec you can use:

SELECT COUNT(*) FROM t_message 
WHERE f_delivered < 
UNIX_TIMESTAMP('2010-10-24 01:02:03')
;


As Requested in RET-7262

SELECT 
    abook_id AS 'Retain ID',
    td.f_name AS 'Mail Server or Domain',
    CONCAT(f_first, ', ', f_last) AS 'First and Last Name',
    f_mailbox AS 'Mailbox ID',
    f_email AS 'Email Address',
    COUNT(*) AS 'Number of Messages'
    FROM t_message tm
        INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
        INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
        INNER JOIN t_domains td ON ta.domain_id = td.domain_id
    WHERE f_stored < UNIX_TIMESTAMP(DATE_ADD(CURDATE(), INTERVAL -3 MONTH))
    GROUP BY abook_id
    ORDER BY COUNT(*)
;


MSSQL specific stuff:

The DATEADD feature is similar to the MySQL DATE_ADD feature.

SELECT abook_id, MIN(f_first) AS 'First Name', MIN(f_last) AS 'Last Name', MIN(f_mailbox) AS 'Mailbox ID', MIN(f_email) AS 'Email Address', COUNT(*) AS 'Number of Messages'
FROM t_message tm
INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
GROUP BY abook_id
HAVING dateadd(S, MAX(f_stored), '1970-01-01') > '2017-02-01'
ORDER BY COUNT(*)
GO

Newer than 3 months:

SELECT ta.abook_id, MIN(ta.f_first) AS 'First Name', MIN(ta.f_last) AS 'Last Name', MIN(ta.f_mailbox) AS 'Mailbox ID', MIN(ta.f_email) AS 'Email Address', COUNT(*) AS 'Number of Messages'
FROM t_message tm
INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
GROUP BY abook_id
HAVING DATEADD(S, MIN(ts_cache), '1970-01-01') > DATEADD(m, -3, GETDATE())
ORDER BY COUNT(*);

Number of users with data:

Total Users with data:

Select count(*) 
    AS 'Number of users with data'
From
(
SELECT
    COUNT(*) Col1
  FROM t_message tm
    INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
    INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
    INNER JOIN t_domains td ON ta.domain_id = td.domain_id
  GROUP BY abook_id
)
AS t
GO

Number of Users in Exchange/O365 with data (Works up until 4.1.0.1, 4.2 will change things):

Select count(*) AS 'Number of Exchange/O365 Users with archive data'
From
(
SELECT
    COUNT(*) Col1
  FROM t_message tm
    INNER JOIN t_uuid_mapping tum ON tm.uuid_mapping_id = tum.uid_mapping_id
    INNER JOIN t_abook ta ON ta.f_uid = tum.f_uuid
    INNER JOIN t_domains td ON ta.domain_id = td.domain_id
  WHERE td.f_type = 1
  GROUP BY abook_id
)
AS t
GO
Personal tools
Namespaces

Variants
Actions
Home
Exchange
GroupWise
JAVA
Linux
MTK
Retain
GW Monitoring and Reporting (Redline)
GW Disaster Recovery (Reload)
GW Forensics (Reveal)
GWAVA
Secure Messaging Gateway
GW Mailbox Management (Vertigo)
Windows
Other
User Experience
Toolbox
Languages
Toolbox