Modeling Superhero Alignment

The Marvel and DC comics universes supply a large, combined data set of fictional characters. Using fan-edited scorecards of superhero stats (intelligence, strength, speed, durability, combat) and an extensive set of superpower labels, this study looks for what distinguishes the profiles of ‘good’ and ‘bad’ aligned characters.

Question: Which superpower qualities might distinguish ‘good’ from ‘bad’ types? If you’re not familiar with the comics, pause and consider two recent, well-known superhero film series: “X-Men” and “Avengers”. Both commonly feature one supremely powerful supervillain supported by a small cadre of villains, whose plans are thwarted by a larger, coordinated coalition of good guys. In individual superhero films like “Spider-Man” and “Batman”, by contrast, a well-rounded good guy typically goes up against a series of bad types who are each skilled in only one or two dimensions. How well do these impressions from popular superhero films bear out in the broader comic book data?

Bad Guys: Marvel (left) and DC Comics (right)
Total superpower stats of ‘bad’ guys are greater than those of ‘good’ guys

Getting a feel for the stat data, we see that ‘bad’ characters are on average more powerful than ‘good’ characters.

Additionally, the ‘bad’ group contains a larger population of “very strong” profiles that the ‘good’ guys need to defeat collectively. The ‘bad’ group also has a low population of “weak” types: villains shouldn’t be too easy to defeat.
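As a rough illustration of this comparison, the sketch below computes the mean total stat and the share of “weak” and “very strong” profiles per alignment. The rows and both cut-off values are invented for the example and are not taken from the study's data; the field names are assumptions as well.

```python
from collections import defaultdict
from statistics import mean

# Toy rows invented for illustration; not the study's actual data.
characters = [
    {"name": "A", "alignment": "good", "total": 310},
    {"name": "B", "alignment": "good", "total": 120},
    {"name": "C", "alignment": "good", "total": 450},
    {"name": "D", "alignment": "bad", "total": 520},
    {"name": "E", "alignment": "bad", "total": 380},
    {"name": "F", "alignment": "bad", "total": 560},
]

# Hypothetical cut-offs for "weak" and "very strong" total stats.
WEAK_BELOW = 200
VERY_STRONG_ABOVE = 500


def summarize_by_alignment(rows):
    """Return mean total and share of weak / very strong profiles per alignment."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["alignment"]].append(row["total"])
    summary = {}
    for alignment, totals in groups.items():
        summary[alignment] = {
            "mean_total": mean(totals),
            "share_weak": sum(t < WEAK_BELOW for t in totals) / len(totals),
            "share_very_strong": sum(t > VERY_STRONG_ABOVE for t in totals) / len(totals),
        }
    return summary


summary = summarize_by_alignment(characters)
for alignment, stats in summary.items():
    print(alignment, stats)
```

On real data the same summary would show whether the gap in means comes from the strong tail, the missing weak tail, or both.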

Scatter plot Intelligence vs Strength

Individual stat categories are also roughly uniformly distributed and are not heavily correlated with one another.
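One way to check a claim like this is to compute the Pearson correlation between pairs of stat columns. The following is a minimal sketch with made-up values standing in for two of the stat categories; it is not the author's code, and a data-frame library would normally do this in one call.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration only.
intelligence = [10, 30, 50, 70, 90, 40, 60, 20]
strength = [55, 20, 80, 35, 60, 95, 10, 70]

r = pearson(intelligence, strength)
print(f"intelligence vs strength: r = {r:.2f}")
```

A coefficient near 0 indicates little linear relationship, while values near 1 or -1 would suggest that two stats carry largely the same information.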

If we also include the superpower labels and impute the mean for characters with missing stats, we can build a model to look for insights. We start by fitting a random forest model and extracting feature importances for the target, ‘Alignment’.
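The mean-imputation step can be sketched as follows. This is a plain-Python illustration under assumed column names, with missing stats represented as `None`; the original analysis may well have used a library routine instead.

```python
from statistics import mean


def impute_mean(rows, columns):
    """Return copies of rows with None replaced by that column's observed mean."""
    fill = {
        col: mean(r[col] for r in rows if r[col] is not None)
        for col in columns
    }
    return [
        {**r, **{col: fill[col] if r[col] is None else r[col] for col in columns}}
        for r in rows
    ]


# Toy rows invented for illustration; None marks a missing stat.
rows = [
    {"name": "A", "strength": 80, "speed": 40},
    {"name": "B", "strength": None, "speed": 60},
    {"name": "C", "strength": 20, "speed": None},
]

imputed = impute_mean(rows, ["strength", "speed"])
for row in imputed:
    print(row)
```

Mean imputation keeps every character in the training set, at the cost of pulling the imputed characters toward an average profile.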

Random Forest Feature Performance
Random Forest Model Accuracy = 72%

The stat categories dominate the Random Forest split importance. Moving to XGBoost lets us tune the model's hyperparameters, which gives a modest gain in accuracy.

XGBoost Model Accuracy = 74%

Permutation importance estimates have high error.
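For readers unfamiliar with the technique, permutation importance shuffles one feature at a time and measures how much accuracy drops; repeating the shuffle gives a spread around the estimate. The sketch below is a generic, model-agnostic version. `toy_predict` is a hypothetical stand-in for a fitted model, and the data is invented.

```python
import random
from statistics import mean, stdev


def accuracy(predict, X, y):
    """Fraction of rows whose prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, column, n_repeats=20, seed=0):
    """Mean and standard deviation of the accuracy drop when `column` is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    drops = []
    for _ in range(n_repeats):
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        X_perm = [{**row, column: value} for row, value in zip(X, shuffled)]
        drops.append(baseline - accuracy(predict, X_perm, y))
    return mean(drops), stdev(drops)


# Hypothetical stand-in for a fitted model: predicts 'bad' above a total of 400.
def toy_predict(row):
    return "bad" if row["total"] > 400 else "good"


# Invented rows; "noise" is a column the toy model ignores.
X = [
    {"total": t, "noise": n}
    for t, n in [(520, 1), (120, 2), (450, 3), (310, 4), (560, 5), (200, 6)]
]
y = [toy_predict(row) for row in X]

for column in ("total", "noise"):
    drop, spread = permutation_importance(toy_predict, X, y, column)
    print(f"{column}: mean drop = {drop:.2f}, std = {spread:.2f}")
```

When the standard deviation is large relative to the mean drop, as the note above suggests for this model, the ranking of features by permutation importance should be read with caution.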

Partial Dependence Plots now show how well the film-based impressions discussed earlier hold up.

PDP for Total superpower stats

As the earlier ‘good’ vs ‘bad’ distributions suggested, the “very strong” characters push the predictions toward ‘bad’.

PDP for Intelligence

Moderate individual stats push a character towards ‘good’ until too much individual power reverses the trend towards ‘bad’.
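A partial dependence curve like the ones above can be computed by forcing one feature to each value on a grid and averaging the model's predictions over all characters. The sketch below assumes a hypothetical `toy_proba_bad` function in place of the fitted model, hand-written to mimic the pattern described here; the rows and thresholds are invented.

```python
from statistics import mean


def partial_dependence(predict_proba, X, column, grid):
    """Average predicted probability when `column` is forced to each grid value."""
    curve = []
    for value in grid:
        forced = [{**row, column: value} for row in X]
        curve.append((value, mean(predict_proba(row) for row in forced)))
    return curve


# Hypothetical stand-in for a fitted model's P('bad'); thresholds are invented.
def toy_proba_bad(row):
    p = 0.5
    if row["total"] > 400:
        p += 0.2  # very strong totals lean 'bad'
    if 40 <= row["intelligence"] <= 80:
        p -= 0.2  # moderate intelligence leans 'good'
    elif row["intelligence"] > 80:
        p += 0.1  # extreme intelligence leans 'bad'
    return p


# Invented rows for illustration only.
X = [
    {"total": 520, "intelligence": 90},
    {"total": 120, "intelligence": 50},
    {"total": 450, "intelligence": 70},
    {"total": 310, "intelligence": 30},
]

curve = partial_dependence(toy_proba_bad, X, "intelligence", [20, 60, 100])
for value, avg_p in curve:
    print(f"intelligence = {value}: average P(bad) = {avg_p:.2f}")
```

Because every other feature keeps its observed values, the curve isolates the average effect of the one feature being varied.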
