I encountered this question in a quiz earlier this morning, and I’m not sure what the answers are.

Question Options
1. L1 regression and Singular value decomposition
2. Regularized regression and Omitting columns
3. Principal Component Analysis and Kmeans Clustering
4. Normalize column values and omitting columns
5. Regularized regression and L1 regression
6. Normalized column values and Singular value decomposition

I’d like to view a collaboration graph between two arbitrary authors.

For instance, the Erdős number famously describes the distance between an arbitrary author and Paul Erdős, measured by co-authorship of mathematical papers.

I’d like to find the shortest path, or the overall strength of ties, between any two authors in a graph where the nodes are authors and each edge represents co-authorship of an academic publication.

http://pubnet.gersteinlab.org/ purports to generate a map like this, but unfortunately it doesn’t appear to be working for me.

Is there any way to do this?
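
For the shortest-path part, one possible approach is a breadth-first search over the co-authorship graph. The following is only a minimal sketch: it assumes you already have the author list of each paper from some bibliographic source, and the names `build_coauthor_graph`, `shortest_path`, and the sample authors are all hypothetical.

```python
from collections import deque


def build_coauthor_graph(papers):
    """Map each author to the set of people they have co-authored with.

    `papers` is assumed to be an iterable of author-name lists, one per paper.
    """
    graph = {}
    for authors in papers:
        for author in authors:
            others = (a for a in authors if a != author)
            graph.setdefault(author, set()).update(others)
    return graph


def shortest_path(graph, source, target):
    """Return one shortest chain of co-authors from source to target, or None."""
    if source not in graph or target not in graph:
        return None
    parent = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            # Walk back through the parents to rebuild the path.
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for neighbour in graph[current]:
            if neighbour not in parent:
                parent[neighbour] = current
                queue.append(neighbour)
    return None


# Made-up sample data: each inner list is the author list of one paper.
papers = [
    ["Author A", "Author B"],
    ["Author B", "Author C"],
    ["Author C", "Author D"],
    ["Author E"],
]
graph = build_coauthor_graph(papers)
path = shortest_path(graph, "Author A", "Author D")
distance = len(path) - 1  # number of co-authorship links between the two
```

The strength of a tie is not covered here; one simple option, again an assumption rather than an established measure, would be to count the number of papers two authors share and store that count as an edge weight.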

I have gathered data through the Twitter Search API as well as the Streaming API. I would like to know how screen scraping data from Twitter (20 tweets at a time until the whole page has been scrolled) differs from using the Search and Streaming APIs.

The Search API only goes back 7 days, whereas scraping lets us go back quite a bit further. Does scraping give us all the data, or is it also only about 1% of the Firehose?

I have been trying to search for this information but have not been successful in finding anything.

Any help would be appreciated.

Publication bias. Reproducibility problem. Abusing statistical tests.

These are some of the many criticisms that have long been levelled at all fields of science. If I read an article in Psychological Science and am sceptical of its results, or if I want to apply another statistical technique to see whether the results remain convincing to me, I can’t. I need to run another experiment. Or if I want to conduct a meta-analysis, having other researchers’ raw data may be better than just the mean/CI they report in journals.

If scientists’ mission is the public good and the advancement of knowledge, why don’t they publish their raw results (after removing research participants’ private information, of course)? They shouldn’t be afraid of others criticising their work. Only truth can endure the test of time.

Nowadays, with cheap and widely available online storage platforms and sophisticated database management, why don’t they do it for the public good?

EDIT: By raw data, I mean making the dataset public and accessible to everyone (well… at least researchers).

I am really new to writing research papers and don’t have a professor with whom I can discuss this. I have read a paper on drug-drug interaction that uses a particular machine learning approach to produce its results. I want to extend it to get results on drug-gene interaction and drug-drug-gene interaction. The dataset would be different, but the essential algorithm would not be very different. Are such findings enough to constitute a new paper, or should the paper also take a new approach to the dataset? I have recently completed my undergraduate degree and want to write a paper but, as I said, I don’t have an advisor.

Institute II is setting up a benchmark project under the lead of main author MM. Twenty other developers participate and run their methods without access to the ground truth (GT) data; MM also contributes the results of his own method.

A paper is written by MM. Do one or all of the other 20 developers have the right to access the GT data to check whether the analysis of the results is reasonable? (The results have already been submitted to MM, so no tuning or overfitting can happen.)

The best source for publication ethics known to me is COPE, the Committee on Publication Ethics. They have a document of author guidelines (PDF), which recommends that an author submit “A declaration that that person takes responsibility for the integrity of the paper”.

As always in ethics, this is only a recommendation; the final decision obviously rests with the journal.

In my opinion, I can try to get access to the GT by emailing MM. If access is not granted, however, I cannot personally vouch for the integrity of the paper and would have to withdraw my authorship. Alternatively, if MM is not cooperative, I could contact the editor of the journal and let him decide.

Am I right in my conclusions? Are there other strong, widely followed ethical guidelines on this issue?

Thanks for any further recommendations.