Paralyze Testing

**Kenji** · 2008-06-12 15:49

Originally Posted by ambutter

If I flipped a quarter 10 times, and it came up heads 6/10 times... then I flipped a dime 10 times, and it came up heads 4/10 times... can I clearly conclude, based on my tests, that a quarter is more likely to come up heads than a dime is?

No, because you're using a bad analogy. A better one would be that you're taking two coins and flipping them. And one coin always comes up heads, while the other always comes up tails. You don't need to have a phd to realize that you have an interesting scenario here.

@Nekio

I was gonna shoot you down, but more than a few people already did. Current testing has shown Para II to be vastly better than Para I. If you disagree, then the onus is on you to conduct testing that shows otherwise. You remind me of an armchair statistician, spouting out the big words he learned in class, without really trying to grasp why some of them don't apply here.

Lol t-test.

**ambutter** · 2008-06-12 16:31

Originally Posted by Kenji

Originally Posted by ambutter

If I flipped a quarter 10 times, and it came up heads 6/10 times... then I flipped a dime 10 times, and it came up heads 4/10 times... can I clearly conclude, based on my tests, that a quarter is more likely to come up heads than a dime is?

No, because you're using a bad analogy. A better one would be that you're taking two coins and flipping them. And one coin always comes up heads, while the other always comes up tails. You don't need to have a phd to realize that you have an interesting scenario here.

@Nekio

I was gonna shoot you down, but more than a few people already did. Current testing has shown Para II to be vastly better than Para I. If you disagree, then the onus is on you to conduct testing that shows otherwise. You remind me of an armchair statistician, spouting out the big words he learned in class, without really trying to grasp why some of them don't apply here.

Lol t-test.

/sigh... not trying to get into an e-pissing match... but it seems you're missing the point. My analogy was meant to point out the necessity for a reasonable number of tests to prove something statistically. Ten is not a reasonable number.

To your second point... I'm not under the impression that Nekio thinks Para II isn't "vastly better" than Para I, only that such a small number of tests isn't sufficient to say that anything is definitively true or false. You can argue all you want, but no statistician, nor anyone that has a sound understanding of statistics, will agree that the results from the OP offer any statistically definitive conclusions.

Is Para II better than Para I? Of course it is. No one is arguing that Para I is superior. No one will argue with you if you say that Para II is better than Para I. Some are merely pointing out that the small data set presented isn't enough to statistically back up that claim.
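To put a rough number on the coin example, here is a minimal sketch of a two-sided Fisher exact test using only the standard library. The function name `fisher_exact_two_sided` is made up for illustration, and the "always heads vs. always tails" case assumes 10 flips per coin, which nobody in the thread actually specified.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k "heads" in row 1 with margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Add up every table that is no more likely than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= observed * (1 + 1e-9))

# Quarter: 6 heads, 4 tails. Dime: 4 heads, 6 tails.
p_mild = fisher_exact_two_sided(6, 4, 4, 6)

# ASSUMPTION: 10 flips each for the "always heads vs. always tails" coins.
p_extreme = fisher_exact_two_sided(10, 0, 0, 10)

print(f"6/10 vs 4/10:  p = {p_mild:.3f}")
print(f"10/10 vs 0/10: p = {p_extreme:.2e}")
```

The 6/10 vs. 4/10 split comes out around p = 0.66, nowhere near significant, while the all-or-nothing split is tiny even at ten flips each. How many tests count as "enough" depends heavily on how big the gap is.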

**ringthree** · 2008-06-12 16:39

Originally Posted by ambutter

Originally Posted by Kenji

Originally Posted by ambutter

If I flipped a quarter 10 times, and it came up heads 6/10 times... then I flipped a dime 10 times, and it came up heads 4/10 times... can I clearly conclude, based on my tests, that a quarter is more likely to come up heads than a dime is?

No, because you're using a bad analogy. A better one would be that you're taking two coins and flipping them. And one coin always comes up heads, while the other always comes up tails. You don't need to have a phd to realize that you have an interesting scenario here.

@Nekio

I was gonna shoot you down, but more than a few people already did. Current testing has shown Para II to be vastly better than Para I. If you disagree, then the onus is on you to conduct testing that shows otherwise. You remind me of an armchair statistician, spouting out the big words he learned in class, without really trying to grasp why some of them don't apply here.

Lol t-test.

/sigh... not trying to get into an e-pissing match... but it seems you're missing the point. My analogy was meant to point out the necessity for a reasonable number of tests to prove something statistically. Ten is not a reasonable number.

To your second point... I'm not under the impression that Nekio thinks Para II isn't "vastly better" than Para I, only that such a small number of tests isn't sufficient to say that anything is definitively true or false. You can argue all you want, but no statistician, nor anyone that has a sound understanding of statistics, will agree that the results from the OP offer any statistically definitive conclusions.

Is Para II better than Para I? Of course it is. No one is arguing that Para I is superior. No one will argue with you if you say that Para II is better than Para I. Some are merely pointing out that the small data set presented isn't enough to statistically back up that claim.

Nitpick! It wasn't ten tests; every attack round is a data point, or a test. This is why a percentage isn't as good information as # of procs vs. # of total attack rounds.
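As a rough illustration of why the raw counts matter, the sketch below computes an approximate 95% Wilson score interval for a proc rate. The counts are hypothetical and NOT taken from anyone's testing; `wilson_interval` is just an illustrative helper name.

```python
from math import sqrt

def wilson_interval(procs, rounds, z=1.96):
    """Approximate 95% Wilson score interval for procs / rounds."""
    p_hat = procs / rounds
    denom = 1 + z * z / rounds
    center = (p_hat + z * z / (2 * rounds)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / rounds
                    + z * z / (4 * rounds * rounds)) / denom
    return center - half, center + half

# HYPOTHETICAL counts: both work out to a 30% proc rate.
for procs, rounds in [(3, 10), (60, 200)]:
    low, high = wilson_interval(procs, rounds)
    print(f"{procs}/{rounds}: {low:.3f} to {high:.3f}")
```

Both lines report "30%", but the interval for 200 attack rounds is far narrower than the one for 10, and a bare percentage hides that.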

**Kirschy** · 2008-06-12 16:44

Originally Posted by Callisto

Originally Posted by Kirschy

As Belkin already said, this is the start to further testing. I myself am very curious to see the results of testing /w Para vs Para II at dMND=0.

This might actually be difficult to do, lol. I have 70+ MND naked, and the mobs that I can actually test on without getting my head kicked in will likely have a chunk less than that, so I'll have to find a Spike Necklace and some other MND- crap. I'll try to get a MND value on Robber Crabs and start w/ testing on them if I get no PLD invites tonight.

Edit: I actually can't think of any other easily obtainable MND- gear for RDM off the top of my head, lol. Any suggestions?

If you're able to use another character to secure the MND value, you can sj a lv 1 and lower your MND. Spike Necklace as you mentioned works well. When I did my tests on crabs, they always had 66 MND. I used the same crab for most testing, so I don't know if all robber crabs are 66 MND or I just got lucky.

**Callisto** · 2008-06-12 16:48

Originally Posted by Kirschy

If you're able to use another character to secure the MND value, you can sj a lv 1 and lower your MND. Spike Necklace as you mentioned works well. When I did my tests on crabs, they always had 66 MND. I used the same crab for most testing, so I don't know if all robber crabs are 66 MND or I just got lucky.

Thanks, I wasn't sure what the level variance was on Robbers so I was thinking about just doing EM Steelshells as RDM/PLD and bringing my DRG/mage alt just in case. I have LS mates trying to round me up a Forest Belt and Malflood at the moment, when Einherjar lets out tonight I'll head out to the Tree and try to get some #'s.

Note: I have level 2 Para II so someone else will have to get values for level 1, but I'll post the Steelshell MND stats and whatnot when I have the info.

**Nekio** · 2008-06-12 16:49

All right, fine. Let's do this.

Firstly, no, we do not know the number of attack rounds for each casting. However, we do know that this test was done on low-level mobs, and that, anecdotally, the duration of Paralyze is not particularly variable at that level difference. It is an assumption to claim that the number of attack rounds is equal between the different casts, but one that is already made if you even want to talk about "average" proc rates for Para I vs. Para II with these tests. You can't have your cake and eat it too: either the averages are accurate or they're not. We're all making the assumption that the averages are valid and representative of a typical number of attack rounds; otherwise this debate wouldn't even be occurring, because any comparisons between the data sets would immediately be voided. For the same reason, we have to assume a normal distribution if we want to compare means in this setting.

So, assuming that the percentages for each data point represent an approximately equal number of attack rounds, we come up with an average proc rate of 31.0 for Para II and 17.5 for Para I. The sample size is 5 for Para II and 2 for Para I.

The variance of the samples for Para II is 246, which is enormous. It's obviously much smaller for Para I.

Using these values, we get a t-value of 0.123 (I originally stated 0.073, but forgot to adjust the equation for unequal variance... either way it's irrelevant). I'm not sure where the 6.077 value came from. The correct equation is:

t = (mean1 - mean2) / sqrt(var1^2/n1 + var2^2/n2)

A t-value of 0.123 gives an approximate probability of the two means being different of less than 75% (much less, probably closer to 50%).

I hate when people try to claim credentials on the intarwebz, but I'll go ahead and make it clear that I've been in graduate school studying trends in biological data for 6 years, and I studied statistics during that time and for about 4 years before that. I'm by no means a statistician, but I certainly have enough experience in the basics of stats (which this is... very basic) to apply the calculations to a data set and to know that a sample size of 2 is pretty much ALWAYS useless.

*EDIT* Meh too slow. My points still stand. Without making further assumptions about the quality of the data, we cannot make any accurate conclusions because we. need. more. results.

**Callisto** · 2008-06-12 16:54

All the statistics arguments are fascinating, truly, but how about you guys go cast Paralyze on some shit and contribute or STFU?

**Nekio** · 2008-06-12 16:59

Originally Posted by Callisto

All the statistics arguments are fascinating, truly, but how about you guys go cast Paralyze on some shit and contribute or STFU?

Because data is useless without proper interpretation. I've provided interpretation. Which have you provided, data or interpretation?

**Callisto** · 2008-06-12 17:00

Originally Posted by Nekio

Originally Posted by Callisto

All the statistics arguments are fascinating, truly, but how about you guys go cast Paralyze on some shit and contribute or STFU?

Because data is useless without proper interpretation. I've provided interpretation. Which have you provided, data or interpretation?

I've already stated that I will be providing data as soon as I'm not sitting at a desk at work, while half the posts are just mindless math-peen swinging.

**Snapples** · 2008-06-12 17:22

so basically what the general idea is: if you have an awesome mnd setup, dont merit slow2 or para2 beyond lvl 2 or 3 because you can land it at its capped value. if you DONT have a big mnd setup, merit the spell higher which gives you higher magic acc and potency (effectively higher mnd according to the formula). someone please correct me if ive made a drastic mistake here, and yes im making a generalized statement.

**Callisto** · 2008-06-12 17:30

Originally Posted by Snapples

so basically what the general idea is: if you have an awesome mnd setup, dont merit slow2 or para2 beyond lvl 2 or 3 because you can land it at its capped value. if you DONT have a big mnd setup, merit the spell higher which gives you higher magic acc and potency (effectively higher mnd according to the formula). someone please correct me if ive made a drastic mistake here, and yes im making a generalized statement.

I probably wouldn't say it quite like that until we can discern how much MND is needed, mostly b/c if you have a full-on MND build you're likely sacrificing a fair amount of Enfeebling Skill to do so, meaning you may need the Magic Acc more than someone with less MND but more skill.

**Olo** · 2008-06-12 17:40

Originally Posted by Nekio

So, assuming that the percentages for each data point represent an approximately equal number of attack rounds, we come up with an average proc rate of 31.0 for Para II and 17.5 for Para I. The sample size is 5 for Para II and 2 for Para I.

Do you even play this game?

You are both a jackass and stupid.

The sample size is not 2 you fucking idiot. Every single attack round contributes to sample size and you have not been given the attack round data, only an approximation... so you can't even properly analyze it... DICKHEAD

**Kenji** · 2008-06-12 17:41

Originally Posted by ambutter

/sigh... not trying to get into an e-pissing match... but it seems you're missing the point. My analogy was meant to point out the necessity for a reasonable number of tests to prove something statistically. Ten is not a reasonable number.

Except, as Ring pointed out, we're not talking about ten tests. We're talking about a number larger than that, which is why all this talk about "10" is dumb as hell. We have a fairly decent sample size now. All a larger sample size will do, is narrow down the exact Percentage that Para II is better than Para I.

That's the point you're missing.

Originally Posted by Nekio

Without making further assumptions about the quality of the data, we cannot make any accurate conclusions because we. need. more. results.

Calling into question the OP's honesty? And sure, we could always use more results. How much? 100 tests? 1000? 1 million? Who has that kind of time?

Science is the process of finding the truth. Scientific theories stand until disproven. We have a theory here. If you think it's not truthful, then disprove it.

I released a ball 100 times. In all cases, it fell straight down and bounced a bit. Nekio sez: WE NEED MORE TESTS! We can't draw any accurate conclusions!

Guess it's true what they say about lies, damn lies, and statistics. They tend to lose sight of what science truly is. Your "interpretation" could fit any of the three.

**Snapples** · 2008-06-12 18:00

ugh this thread has been totally derailed by people missing the point and fighting over statistic crap

**Nekio** · 2008-06-12 18:03

Originally Posted by Olo

The sample size is not 2 you fucking idiot. Every single attack round contributes to sample size and you have not been given the attack round data, only an approximation... so you can't even properly analyze it... DICKHEAD

Originally Posted by Kenji

Except, as Ring pointed out, we're not talking about ten tests. We're talking about a number larger than that, which is why all this talk about "10" is dumb as hell. We have a fairly decent sample size now. All a larger sample size will do, is narrow down the exact Percentage that Para II is better than Para I.

If you haven't been trained in stats, please STFU and stop trying to interpret the data, because you honestly have no fucking clue what you're talking about.

At any rate, I'm done arguing with morons. Obviously, we need more testing before anything concrete can be said. Has it ever actually been shown that there's a hard cap on Para proc rate? That's a pretty important start. I have a suspicion that MND plays a pretty big role in proc rate when the level difference isn't so large, based on anecdotal experience. It's gonna be tough to quantify that without establishing a true cap.

**Kirschy** · 2008-06-12 18:06

Originally Posted by Snapples

so basically what the general idea is: if you have an awesome mnd setup, dont merit slow2 or para2 beyond lvl 2 or 3 because you can land it at its capped value. if you DONT have a big mnd setup, merit the spell higher which gives you higher magic acc and potency (effectively higher mnd according to the formula). someone please correct me if ive made a drastic mistake here, and yes im making a generalized statement.

There was a clear difference in caps using Slow II /w 2 merits and Slow II /w 3 merits in my testing. Testing /w a 4th and 5th merit hasn't been done yet, but there will probably be a noticeable increase. These kinds of tests /w Paralyze probably won't provide precise data on how much increase is obtained /w further merits.

**ringthree** · 2008-06-12 18:07

Originally Posted by Nekio

*EDIT* Meh too slow. My points still stand. Without making further assumptions about the quality of the data, we cannot make any accurate conclusions because we. need. more. results.

You can say that infinitely, which is why you interpret the data you have and continue to build your data pool to improve that interpretation. What you are saying is self-refuting. We can start with the data we have, make a tentative conclusion, add more data and improve the conclusion. We don't have to limit the data pool because we have a paper to turn in or an article that needs to be published.

**Nekio** · 2008-06-12 18:11

Originally Posted by Ringthree

You can say that infinitely, which is why you interpret the data you have and continue to build your data pool to improve that interpretation. What you are saying is self-refuting. We can start with the data we have, make a tentative conclusion, add more data and improve the conclusion. We don't have to limit the data pool because we have a paper to turn in or an article that needs to be published.

As I said before, I'm not questioning the test or the data, just the conclusions people are making from it. The test is well-done and the data is sound. We only need to collect more data until we are able to show a (statistically) significant difference.

**Olo** · 2008-06-12 18:49

Originally Posted by Nekio

If you haven't been trained in stats, please STFU and stop trying to interpret the data, because you honestly have no fucking clue what you're talking about.

LOL.. you.. are telling me... to "stop trying to interpret the data".... that's grand.

NEWS FLASH!!: I haven't tried to interpret the data because there is no data.... no data has been provided... only tentative conclusions based on data that we can't see.

You, on the other hand, have been trying to interpret this mystery data and have fallen flat on your face.

**Sargas** · 2008-06-12 19:08

Originally Posted by Nekio

So, assuming that the percentages for each data point represent an approximately equal number of attack rounds, we come up with an average proc rate of 31.0 for Para II and 17.5 for Para I. The sample size is 5 for Para II and 2 for Para I.

The variance of the samples for Para II is 246, which is enormous. It's obviously much smaller for Para I.

Using these values, we get a t-value of 0.123 (I originally stated 0.073, but forgot to adjust the equation for unequal variance... either way it's irrelevant). I'm not sure where the 6.077 value came from. The correct equation is:

t = (mean1 - mean2) / sqrt(var1^2/n1 + var2^2/n2)

A t-value of 0.123 gives an approximate probability of the two means being different of less than 75% (much less, probably closer to 50%).

Well first of all I was using a sample size of 10 for Para2 and 5 for Para1, assuming that MND made no difference in proc rate and so using all the tests (which, yes, is a very nontrivial assumption). However if I reduce the sample size to 5 and 2 for the tests done with no gear, I still don't get your numbers; I get a t-value of 3.81, which is still significant.

The formula I'm using from wiki is t = (mean1 - mean2) / sqrt(var1/n1 + var2/n2). And the variance of the samples for Para2 isn't 246, it's 246/(5-1) = 61.5. Those numbers give me the 3.81 figure.
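For anyone who wants to check the arithmetic, here is a minimal sketch of the unequal-variance (Welch) t statistic as written above. The means, sample sizes, and the 246 sum of squared deviations come from this thread. The Para I variance is never stated, so the 0.5 below is purely an ASSUMED placeholder; it happens to land on roughly the 3.81 figure, but the real value would have to come from the raw data. The degrees-of-freedom helper uses the usual Welch-Satterthwaite approximation, which was not part of the posts above.

```python
from math import sqrt

def sample_variance(sum_sq_dev, n):
    """Unbiased sample variance from a sum of squared deviations."""
    return sum_sq_dev / (n - 1)

def welch_t(mean1, mean2, var1, var2, n1, n2):
    """t = (mean1 - mean2) / sqrt(var1/n1 + var2/n2)."""
    return (mean1 - mean2) / sqrt(var1 / n1 + var2 / n2)

def welch_df(var1, var2, n1, n2):
    """Welch-Satterthwaite approximate degrees of freedom."""
    a, b = var1 / n1, var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

var_para2 = sample_variance(246, 5)   # 246 / (5 - 1) = 61.5
var_para1 = 0.5                       # ASSUMPTION: not given in the thread

t = welch_t(31.0, 17.5, var_para2, var_para1, 5, 2)
df = welch_df(var_para2, var_para1, 5, 2)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Note that with samples this small the degrees of freedom are also small, so the critical t-value needed for significance is much higher than the large-sample 1.96.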
