<html>
<head>...</head>
<body class="css" style="height: 100%;">
<h1 class="pull-left" style="margin: 0px 0px 6px 6p;">Chat Archive</h1>
<a href="..." style="display:inline-block;padding:10px;">Return to Details Page</a>
<div class="clear"></div>
<div style="margin:10px";>...</div>
<div>...</div>
<div class="clear"></div>
<div class="textchatcontainer" id="textchat">...</div>
Roll20 Aggregator is a web app that parses the chat log of a Roll20 game to display aggregate statistics and answer questions like: Who wrote the most messages? Who rolled the most 1s? Who was the luckiest? Were the virtual dice fair?
More fully, the following features are available upon uploading a chat log:
Roll20 Aggregator was built using the following technologies:
This site does not store any user data.
This site is not affiliated with Roll20.
The parser works by checking the HTML of each .message div and parsing it for certain classes and attributes that identify rolls.
There are two chief classes of message to look out for: .rollresult, which represents a roll block (the more
pictographic result you get if you were to type /r d20); and .diceroll, which represents an inline roll
(the more compact, textual result you get if you were to type [[d20]]).
In a roll block, the relevant roll information can be obtained from the .dicegrouping div. In the below example,
we can see from the .diceroll div that a d6 was rolled, and from the .didroll div that a 3 was the raw roll.
Importantly, this is the value that was rolled, before any modifiers were applied.
<div data-origindex="0" class="diceroll d6">
<div class="dicon">
<div class="didroll">3 </div>
<div class="backing"> </div>
</div>
</div>
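As a rough illustration of the roll block case, the sketch below pulls the die type and raw roll out of the example markup using Python's standard html.parser module. The class name RollBlockParser and the rule that the die type is the class of the form d followed by digits are assumptions made for this example, not a description of the aggregator's actual code.

```python
from html.parser import HTMLParser

# The roll block example from above, used as test input.
SAMPLE = """
<div data-origindex="0" class="diceroll d6">
<div class="dicon">
<div class="didroll">3 </div>
<div class="backing"> </div>
</div>
</div>
"""


class RollBlockParser(HTMLParser):
    """Hypothetical helper: collects (die type, raw roll) pairs."""

    def __init__(self):
        super().__init__()
        self.rolls = []            # (die type, raw roll) tuples found so far
        self._die_type = None      # die type of the most recent .diceroll div
        self._in_didroll = False   # True while reading a .didroll div

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "diceroll" in classes:
            # Assumption: the die type is the class shaped like "d6" or "d20".
            self._die_type = next(
                (c for c in classes if c[0] == "d" and c[1:].isdigit()), None
            )
        elif "didroll" in classes:
            self._in_didroll = True

    def handle_data(self, data):
        text = data.strip()
        if self._in_didroll and text.isdigit():
            self.rolls.append((self._die_type, int(text)))

    def handle_endtag(self, tag):
        # .didroll holds only text, so any closing tag ends it.
        self._in_didroll = False


parser = RollBlockParser()
parser.feed(SAMPLE)
parser.close()
rolls = parser.rolls
print(rolls)  # [('d6', 3)]
```

The value recorded is the raw face that came up, matching the point above that modifiers are not included.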
Things are a little messier for inline rolls. In the example below, we can see that all the roll information is contained in
the original-title attribute of the .inlinerollresult span. The die type can be parsed from the
1d100 string, and the raw result can be found within the .basicdiceroll span. Note again that in this case
we are interested in the 49 that was rolled, not the resulting 40 from a -9 being applied.
<span class="inlinerollresult showtip tipsy-n-right"
original-title="Rolling 1d100 - 9 = (<span class="basicdiceroll">49</span>)-9">
40
</span>
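The inline case can be sketched with two regular expressions over the original-title text. The function name parse_inline_title is hypothetical, and the sketch assumes the attribute value has already been extracted from the span; it calls html.unescape in case the inner markup arrives entity-escaped. It only reads the first dice expression in the title, so more complex formulas would need extra handling.

```python
import html
import re


def parse_inline_title(title):
    """Hypothetical helper: read die count, die size and raw rolls from a title."""
    text = html.unescape(title)
    # "Rolling 1d100 - 9 = ..." gives count 1 and 100 sides; "d20" implies count 1.
    dice = re.search(r"Rolling\s+(\d*)d(\d+)", text)
    if dice is None:
        return None
    # Each raw result sits inside a span whose class starts with basicdiceroll.
    raw = re.findall(r'class="basicdiceroll[^"]*">\s*(\d+)\s*<', text)
    return {
        "count": int(dice.group(1) or 1),
        "sides": int(dice.group(2)),
        "raw_rolls": [int(value) for value in raw],
    }


title = 'Rolling 1d100 - 9 = (<span class="basicdiceroll">49</span>)-9'
result = parse_inline_title(title)
print(result)  # {'count': 1, 'sides': 100, 'raw_rolls': [49]}
```

The 49 is captured as the raw roll; the displayed total of 40 is never read.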
Note that messages may contain multiple rolls, and in rare cases the two types of rolls are combined: a roll block may contain an inline roll.
Using the above methods, we can associate a die roll with the author of the message, which can typically be found in the
.by span. This span will not be present in consecutive messages by the same user, so if not present, the message
is associated with the previous author that was identified.
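The carry-forward rule for authorship can be sketched as a single pass over the messages. The dictionary layout here (an optional "by" key and a "rolls" list) is an assumed intermediate format invented for this example.

```python
def assign_authors(messages):
    """Hypothetical helper: pair each roll with the most recently seen author."""
    attributed = []
    current_author = None  # stays None until the first .by span is seen
    for message in messages:
        by = message.get("by")
        if by:
            # A new .by span starts a run of messages by this author.
            current_author = by.strip()
        for roll in message.get("rolls", []):
            attributed.append((current_author, roll))
    return attributed


messages = [
    {"by": "Tim", "rolls": [3]},
    {"rolls": [5, 1]},            # no .by span: still Tim
    {"by": "Ana", "rolls": [6]},
]
attributed = assign_authors(messages)
print(attributed)  # [('Tim', 3), ('Tim', 5), ('Tim', 1), ('Ana', 6)]
```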
Further complications to identifying authorship arise through emote messages, that is, messages of the following format:
August the Second shoots a fireball, dealing [[1d20]] damage.
These messages do not have any HTML information to uniquely identify the author, so we must use the text content itself, which begins with the character name. However, while as a human we can read the above and understand that the character is named August the Second, the parser has no immediate way to know which words should be included in the character name.
The parser attempts to get around this using the following strategy:
.avatar div)
to character names. Note that to our advantage, users will by default likely have different avatars.
The fundamental concept used across dice analysis is the average result of the die. This is equal to (S + 1) / 2,
where S is the number of sides. On a six-sided die, or d6, for instance, the average roll would be (6 + 1) / 2 = 3.5.
A 3.5 can never actually be rolled in a single roll of the d6, of course, but over many rolls, the average roll will approach
this value if the d6 is truly random.
The above formula can be extended to rolling multiple dice. For example, if you roll two six-sided dice, 2d6, the average result
would be 2(3.5) = 7. More generally, the average result for a dice roll can be said to be N(S + 1) / 2,
where N is the number of dice and S is the number of sides of the die.
Using only these concepts, we can sum up all of the dice rolled across a campaign by a hypothetical character named Tim and calculate the average result of all these dice to determine whether Tim rolled above or below average.
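A small sketch of that comparison, using the N(S + 1) / 2 formula from above. The function names and Tim's sample rolls are made up for illustration.

```python
def expected_total(num_dice, sides):
    """Average result of rolling num_dice dice with the given number of sides."""
    return num_dice * (sides + 1) / 2


def difference_from_average(rolls):
    """rolls is a list of (sides, raw value) pairs; returns observed minus expected."""
    observed = sum(value for _, value in rolls)
    expected = sum(expected_total(1, sides) for sides, _ in rolls)
    return observed - expected


print(expected_total(1, 6))  # 3.5
print(expected_total(2, 6))  # 7.0

# Hypothetical rolls for Tim: two d6 and one d20.
tim_rolls = [(6, 4), (6, 2), (20, 15)]
tim_difference = difference_from_average(tim_rolls)
print(tim_difference)  # 3.5, so Tim rolled above average in this sample
```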
When it comes to comparing the rolls of two different characters, though, we need a way to express how far from the average the result is, in a way that doesn't depend on the number of dice rolled or what types of dice they were. Otherwise, it wouldn't make sense: a character rolling only d100s will of course have a higher average than a character rolling only d6s.
The Z score statistic does precisely that. It expresses the distance from the mean in units of standard deviation, a measure of how spread out data is around its mean.
Calculating standard deviation for individual dice rolls and combining them into a single pooled standard deviation is beyond the scope of this explanation. For more reading on this topic, check out this article on dice variance by Analytics Check and this article on pooled standard deviation by Statistics How To.
Since the Z scores are normalized, they can be compared between characters. A character with a higher Z score will have on average rolled higher, and a character with a lower Z score will have on average rolled lower.
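One simple way to compute such a score is sketched below. It assumes the rolls are independent, uses the standard variance of a fair S-sided die, (S^2 - 1) / 12, and adds the variances together. This is an illustrative shortcut and not necessarily the pooled calculation the aggregator performs.

```python
import math


def z_score(rolls):
    """rolls is a list of (sides, raw value) pairs for one character."""
    if not rolls:
        raise ValueError("at least one roll is required")
    observed = sum(value for _, value in rolls)
    expected = sum((sides + 1) / 2 for sides, _ in rolls)
    # Assumption: independent rolls, so the variances of the dice add up.
    variance = sum((sides * sides - 1) / 12 for sides, _ in rolls)
    return (observed - expected) / math.sqrt(variance)


# Exactly average across two d6: Z is zero.
z_average = z_score([(6, 3), (6, 4)])
# A single natural 20 on a d20 is well above the mean of 10.5.
z_high = z_score([(20, 20)])
print(z_average, z_high)
```

Because the score is divided by the combined spread of the dice, a character who only rolls d100s and one who only rolls d6s end up on the same scale.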
Calculating the fairness of a die may require more of a statistics lecture than you might want to read. In summary, the goal is to determine how probable the rolls you observed would be if the die were truly random. For example, if you roll a six-sided die 600 times, you would expect each face to come up roughly 100 times. If instead you roll a one 300 times, you might suspect that the die isn't fair.
This idea of comparing observed and expected frequency is precisely what can be calculated with a chi square test. The higher the resulting chi square statistic of this test, the lower the chance that you would see your results with a truly random die.
Specifically, the test produces a p value that represents how likely results at least as lopsided as yours would be if the dice were truly random. A p value of 0.10, for instance, would mean that fair dice would produce results this far from the expected counts about 10% of the time.
There is no set threshold at which one can objectively say a die is unfair. In statistics, we more or less arbitrarily choose what is known as the significance level, a p value threshold at or below which the results are said to be significant. One commonly used significance level is 0.05, and this is what is used by the aggregator.
That is to say, if the calculated p value of your rolls is 0.05 or lower, we reject the assumption that your dice are truly random. If the p value is greater than 0.05, however, we cannot conclude that the dice are random. We can draw no conclusion either way; we have simply not found a significant effect.
For more reading on this topic, check out this post by Ilmari Karonen.
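To make the d6 example concrete, the sketch below computes the chi square statistic for observed face counts against a uniform expectation. The function name is hypothetical, and rather than computing a p value it compares the statistic with 11.070, the tabulated chi square cutoff for 5 degrees of freedom at the 0.05 level. A full implementation would more likely obtain the p value from a statistics library such as SciPy's scipy.stats.chisquare.

```python
def chi_square_statistic(observed_counts):
    """Chi square statistic for face counts against a fair (uniform) die."""
    total = sum(observed_counts)
    if total == 0:
        raise ValueError("no rolls to test")
    expected = total / len(observed_counts)  # each face is equally likely
    return sum((count - expected) ** 2 / expected for count in observed_counts)


# Tabulated cutoff: 5 degrees of freedom (six faces minus one), 0.05 level.
CUTOFF_D6 = 11.070

# 600 rolls of a d6 where a one came up 300 times and the rest 60 times each.
skewed = chi_square_statistic([300, 60, 60, 60, 60, 60])
# 600 rolls spread perfectly evenly.
even = chi_square_statistic([100] * 6)

print(skewed, skewed > CUTOFF_D6)  # 480.0 True
print(even, even > CUTOFF_D6)      # 0.0 False
```

A statistic above the cutoff corresponds to a p value below 0.05, so the skewed counts would be flagged while the even counts would not.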
Although care was taken in allowing the aggregator to parse as many rolls as possible as accurately as possible, there are known limitations to the parser and this site as a whole: