In my understanding, some people have donated their academic login credentials, ...

dotdi · on Oct 24, 2017

I will neither admit nor deny supplying my credentials to Sci-hub.

afandian · on Oct 24, 2017

I don't think you'll have to. Both hidden and conspicuous watermarks are common in trade book PDF downloads. RIAA members poisoned the BitTorrent well.

I'm neutral on the matter, but there are some obvious paths of recourse for publishers who don't want people sharing their credentials.

(Disclaimer: I work for Crossref, which is an organisation in the scholarly publishing space.)

Typhon · on Oct 24, 2017

In any case, some researchers have claimed that Sci-Hub obtained their credentials through phishing. I don't know if it's true, but such claims provide plausible deniability against that kind of watermarking proof, don't they?

kanzure · on Oct 24, 2017

> Both hidden and conspicuous watermarks are common in trade book PDF downloads

https://github.com/kanzure/pdfparanoia

snowpanda · on Oct 24, 2017

Does only Crossref do this?

I'm asking because a friend of mine who's enrolled in 2 colleges just downloaded the same scholarly PDF (one that's also available through sci-hub) with 2 different college credentials, and both copies had the same sha256sum.

Then when he tried it through sci-hub, he also got the same sha256sum.
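For anyone who wants to repeat this check, here is a minimal sketch in Python. The function names are made up for illustration. Identical SHA-256 digests mean the copies are byte-for-byte the same, so none of them can carry a per-download watermark; it says nothing about publishers or files that were not tested.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def all_identical(paths):
    """True if every file in `paths` has the same SHA-256 digest."""
    return len({sha256_of(p) for p in paths}) == 1
```

Calling `all_identical` with the paths of the three downloads (two colleges plus sci-hub) would return `True` in the situation described above.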

afandian · on Oct 24, 2017

Crossref doesn't do anything at all in relation to SciHub or watermarking PDFs. We don't host content. (I'm happy to talk about what we do do, but it's off-topic)

Sorry if my disclaimer wasn't clear. I was just pointing out my affiliation, as is standard on HN.

auggierose · on Oct 24, 2017

Nah, that doesn't work for academic papers. You usually access them anonymously anyway through your institution (which has a subscription). So even if they watermark (and I don't think they do), the watermark can only identify the institution.

afandian · on Oct 24, 2017

There are plenty of instances of big-name publishers sending nastygrams to whole institutions and/or cutting them off for perceived abuse.

auggierose · on Oct 24, 2017

In that case the institution should unsubscribe and point their members to sci-hub.

arghwhat · on Oct 24, 2017

Keep up the good work.

dingoonline · on Oct 24, 2017

I'm still curious as to how this system works. Don't institutions have logs that would show someone researching topics wildly outside of their field of study?

college_library · on Oct 25, 2017

At my library (smallish US college):

We do keep logs for a period of time, but the library administration's policy is to only investigate them when a publisher contacts us with a claim of abuse (we do not proactively monitor for unusual activity, although we do take steps like limiting the number of concurrent sessions per user and blacklisting IP addresses/ranges with a history of suspicious activity).

The publisher generally supplies examples of timestamps and URLs that were part of the alleged abuse. We use that information to identify the "abusing" user in the log.

Usually there is pretty clear evidence that the user is not conducting legitimate personal research (e.g. the user is a freshman early childhood education major at the local rural branch campus, but they're downloading thousands of chemistry papers from an IP address in China or Russia). Typically the user does not seem like an information freedom warrior, or even to have a clue what is going on, so it seems most likely the credentials were phished.
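The lookup step described above can be sketched roughly as follows. This is an illustration only: the `LogEntry` fields, the function name, and the five-second tolerance are all assumptions, and a real proxy log will have its own format.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import NamedTuple


class LogEntry(NamedTuple):
    # Hypothetical proxy log record; real log formats will differ.
    timestamp: datetime
    user: str
    ip: str
    url: str


def users_matching_report(log, reported, tolerance=timedelta(seconds=5)):
    """Count, per user, the log entries that match a reported (timestamp, url) pair."""
    hits = Counter()
    for when, url in reported:
        for entry in log:
            # A match needs the same URL and a timestamp close to the reported one.
            if entry.url == url and abs(entry.timestamp - when) <= tolerance:
                hits[entry.user] += 1
    return hits
```

The user with the most matches is the natural starting point for a closer look, and the `ip` field on the matching entries gives the kind of geographic signal mentioned above.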

stevemclaugh · on Oct 25, 2017

Thanks for this.

These cases may or may not be phishing. When corporations are hacked for their user credentials, those databases sometimes end up in dark web markets. It would be easy to extract email addresses with .edu domains ... so if a student used their university address for some service and reused the password, there's your login.

Moral of the story: Encourage students to use a password manager and 2FA.

stevemclaugh · on Oct 24, 2017

Academic librarians, who negotiate terms with publishers, are obsessed with privacy and academic freedom. Bless 'em. In theory, publishers don't know who's downloading what. The librarians I've talked to say they delete their logs daily, if not several times a day.

As for geographic location, academics tend to travel a lot. Last I heard (from reading the court docs in Elsevier's lawsuit against Sci-Hub), Sci-Hub stopped using proxy connections a few years ago. They just log in using stored credentials, grab an auth token that lasts X minutes/hours, and download articles from whatever IP is convenient.

dotdi · on Oct 24, 2017

Usually, institutions don't subscribe to everything a publisher has on offer, but rather a subset that is interesting for them. Therefore, if a download works with a set of credentials, it should not look suspicious.

Storing the documents on Sci-hub controlled machines also makes sure that repeated requests don't actually hit the publishers, which significantly lowers what would otherwise be suspiciously high traffic.
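The traffic-reduction point is just ordinary caching. A generic sketch, with hypothetical names and an in-memory dict standing in for real storage, and no claim that this is how Sci-hub is actually built:

```python
class ArticleCache:
    """Serve repeated requests from local storage instead of the upstream source."""

    def __init__(self, fetch_upstream):
        # fetch_upstream is any callable that takes a document id and returns bytes.
        self._fetch_upstream = fetch_upstream
        self._store = {}
        self.upstream_requests = 0

    def get(self, doc_id):
        # Only the first request for a given document reaches upstream.
        if doc_id not in self._store:
            self.upstream_requests += 1
            self._store[doc_id] = self._fetch_upstream(doc_id)
        return self._store[doc_id]
```

However many times the same document is requested, `upstream_requests` only grows by one per distinct document.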

agumonkey · on Oct 24, 2017

The storage relies on libgen, which I think has 200TB of data or more.