Walt Crawford is a semi-retired library writer, editor, speaker, researcher and systems analyst. The motto for this blog used to be “Libraries, music, net media, cruising, policy, and other stuff not quite ready for Cites & Insights.”
I’m pleased to announce that Gold Open Access by Country 2013-2018 is now available. It includes a profile of each country with at least ten gold OA journals active in 2018, and summary notes on each country with one to nine gold OA journals.
Just for fun, I’ve added a thought experiment: if all journals were OA, what would the expected costs likely be–under several scenarios and comparing different sets of countries.
As usual, links to the free PDF and nominally-priced trade paperback ($7, of which I get $0.12) are at https://waltcrawford.name/goaj.html
There will be a third book in a few weeks, beginning with profiles for each subject and adding brief profiles for a range of publishers with large groups of OA journals–specifically including some of the many university publishers. As with the first two parts, it will be available as a free (CC BY) PDF or as a nominally-priced 6×9″ cream paper trade paperback (price set at production cost rounded up to the nearest $0.50).
This report covers 12,150 fully-analyzed journals (out of a universe of 12,415)–and not only did the article count finally exceed 600,000, it exceeded 700,000 for 2018.
As usual, most articles in biomed and STEM involve fees of some sort, while most articles in H&SS (humanities and social sciences) do not–and, as usual, most journals do not have fees.
The incorrect term APC has been replaced by fee (which includes submission fees, processing/publishing fees and required membership dues). The apparently confusing term free has been replaced by no-fee.
Dropped: the visibility metric and the APCLand/OAWorld distinction.
Added: A look at the four iterations of GOA/GOAJ, and a comparison of journals added (or changed such that matching didn’t work) to DOAJ in 2018 with those continuing from GOAJ3.
Changed: A new Key Facts table should make it possible to get a quick picture of the journals and articles, fee and no-fee article percentages, distribution by subject segment, and average article cost for each element–and, for all but the overall picture, how this group (subject, country, publisher category, etc.) compares to the whole dataset. A number of tables have been modified to emphasize article percentages and cost per article. Oh, and the series name has changed to reflect the fact that it always has been about articles in journals.
Gold Open Access by Country 2013-2018 (paperback and free PDF ebook)
Gold Open Access 2013-2018: Subject and Publisher Profiles (paperback, free PDF ebook, and probably all or part of two Cites & Insights issues)
And, to be sure, the May 2019 Cites & Insights, consisting of the first few chapters of GOA4. I’m reconsidering this: it seems like a waste of time and was only read 300 or so times last year. There may or may not be a brief “backgrounder” issue related to GOA4.
“Soon”? A few weeks in each case, depending on other events.
And, for those fond of new colors/minerals/whatever: Yes, of course gold OA includes so-called “platinum” and “diamond” OA.
Readership figures for GOAJ3 (unfortunately missing most of today, 4/30, and the last day of each month)–and the final report on GOAJ2 as well. GOA4 will be out sometime in May, and future reports will include GOA4 and GOAJ3.
As noted in some previous posts, there appeared to be a significant problem with malware-infected OA journals in the Directory of Open Access Journals, which I encountered when doing the data gathering for GOAJ4, or Gold Open Access 2013-2018: Articles in Journals.
This year, I contacted DOAJ–and they acted, with staff contacting affected institutions (almost always universities) and letting them know about the problem.
The results are in, and they’re impressive: retesting problematic cases among the 12,415 journals in DOAJ as of January 1, 2019, I now find all of nineteen malware cases, two of which are “outbound” calls such that, if you’re running good antivirus/antimalware software like Malwarebytes Pro, you can still use the journal site. Those two are from Spain. The other seventeen include four each from Argentina (one university) and Brazil, three from Ecuador, two from Mexico, and one each from Colombia, Peru, Spain and Venezuela. [Added: Minor oops–there’s a third “outbound” case from the U.S.]
SEVENTEEN PLUS TWO–down from more than 200 initially.
It’s also the case that many/most other problematic cases were cleaned up: there are now a total of 100 “XX” cases (unreachable or unworkable, including 404 and other errors).
This is a huge improvement, especially for malware. Consider:
Fourth GOAJ: 100 XX, and 19 XM/malware including two outbound-call cases.
Thanks to Tom, Clara, managing editors and others at DOAJ, and to the responsive contacts in Indonesia and other countries with malware issues.
You can safely assume that, if there’s a GOAJ5 or beyond, I will be staying in touch with DOAJ about data issues. The results this year are, I believe, well worthwhile.
[Yes, the data gathering phase is done. Now for data massaging and report-writing: figure June or possibly late May for the report. I’ll entice you with two big numbers: the report will include more than 12,000 fully-analyzed journals–with more than 700,000 articles in 2018.]
As I work to prepare matrices for Gold Open Access Journals 4, 2013-2018, before starting the second data pass next week (April 1), I find that I’m *definitely* planning some presentation changes, while keeping things as comparable to GOAJ3 as makes sense–but I’m also thinking that some additional changes might make it more useful and used.
The principles behind the planned and possible changes: to increase clarity and to increase the emphasis on articles and costs. That is: you can analyze percentages of no-fee gold OA journals in DOAJ easily enough, directly on DOAJ. What makes my research different is that I look at article counts–and on possible average cost per article for a grouping, which is far more meaningful than average fee per journal.
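A minimal sketch of why the two averages can tell such different stories. The journal names, fees and article counts below are invented for illustration, and the weighting (each journal's fee times its article count, divided by total articles) is an assumption about how a per-article figure could be derived, not a description of the actual GOAJ spreadsheets.

```python
# Hypothetical sample data: (journal, fee in dollars, article count for the year).
journals = [
    ("Journal A", 0, 40),
    ("Journal B", 0, 25),
    ("Journal C", 2000, 900),
    ("Journal D", 500, 35),
]


def average_fee_per_journal(rows):
    """Unweighted mean fee: every journal counts the same, whatever its size."""
    return sum(fee for _, fee, _ in rows) / len(rows)


def average_cost_per_article(rows):
    """Article-weighted mean: total potential fees divided by total articles."""
    total_articles = sum(articles for _, _, articles in rows)
    total_fees = sum(fee * articles for _, fee, articles in rows)
    return total_fees / total_articles


def no_fee_percentages(rows):
    """Return (percent of journals with no fee, percent of articles with no fee)."""
    no_fee_journals = sum(1 for _, fee, _ in rows if fee == 0)
    no_fee_articles = sum(articles for _, fee, articles in rows if fee == 0)
    total_articles = sum(articles for _, _, articles in rows)
    return (100 * no_fee_journals / len(rows),
            100 * no_fee_articles / total_articles)


print(average_fee_per_journal(journals))   # 625.0
print(average_cost_per_article(journals))  # 1817.5
print(no_fee_percentages(journals))        # (50.0, 6.5)
```

In this made-up group, half the journals charge nothing, yet one large fee-charging journal publishes most of the articles, so the per-article figure is nearly three times the per-journal average.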
Change “APC” to “Fee” throughout, and “Free” to “No-fee” (or “No fee” as appropriate). (Note that “Fee” includes submission and processing/publishing fees, and required society memberships if present. That’s always been true for “APC” in these reports.)
Replace the “Journals and articles” table that begins most discussions with a new “Key Facts” table that (a) drops the %Free column for journals (but retains %No-fee for articles), (b) adds a $/article column, (c) expands the table to show figures for each of the three segments as well as overall figures, (d) in most cases (excluding the first/overall occurrence), adds Rel% columns showing how this group’s percentages and $/art compare with overall figures. [Example: For Latin America in 2017, Article fee% is -75% relative to the universe, $/article is down 94%, and so on…]
There will be a chapter offering some comparisons of the four generations of DOAJ data.
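For anyone wondering what a Rel% figure might look like in practice, here is a rough sketch. The formula (group figure minus overall figure, as a percentage of the overall figure) is an assumption based on the Latin America example above; the function name and the sample numbers are hypothetical.

```python
def rel_percent(group_value, overall_value):
    """Percentage difference of a group's figure relative to the overall figure.

    Assumed formula: (group - overall) * 100 / overall.
    """
    if overall_value == 0:
        raise ValueError("overall value must be non-zero")
    return (group_value - overall_value) * 100 / overall_value


# Hypothetical figures, not taken from the report.
overall_cost_per_article = 1000.0
group_cost_per_article = 60.0

print(round(rel_percent(group_cost_per_article, overall_cost_per_article)))  # -94
```

A negative result means the group sits below the overall dataset on that measure; zero means it matches.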
Being considered (comments welcome!)
In most tables that currently show no-fee journal percentages as a column, drop that column and add a $/article column.
Comments welcome (for the next two weeks at least).
Example added [3/26/19]
To help visualize and consider these possible [probable] changes, I’ve generated a PDF consisting of three chapters [using the same Word template as last year]:
A “chapter” with a set of sample tables using all 2017 data (except in one case, where both the universe table and a table for Latin America appear)
Chapter 16, Latin America, from GOAJ3, unchanged.
Chapter 16, Latin America, with new tables and figures based on the changes discussed above. [Commentary has not generally been changed, and I haven’t moved tables to save space. If journal counts are different than those in the first Chapter 16–I haven’t checked–it’s because the new matrix is entirely consistent in using overall analyzed journal counts except where annual activity is clearly indicated; in some cases, the new journal count may be higher.]
In the previous post, I added cases of malware and other problematic journals after going through half of the remaining 6,415.
I’ve now completed the first scan–all 12,415, from the first publisher, “Alexandru Ioan Cuza” University of Iași, and journal, Analele Ştiinţifice Ale Universităţii Alexandru Ioan Cuza din Iași,Sectiunea II A : Genetica si Biologie Moleculara — to the last publisher, سازمان جغرافیایی نیروهای مسلح, and journal, اطلاعات جغرافیایی.
In the process, I encountered another 41 malware-infected journals; these have been added to the existing Google Sheet as Group 4. The link in clear text: https://docs.google.com/spreadsheets/d/1GgYqbnw3E-NJPiFZB1GeIJib8q0OGQNhxpRNaA56pyU/edit?usp=sharing
I also encountered another 137 XX and XN cases (XX=unworkable for one reason or another, XN=not an OA journal), which have been integrated into the existing Google sheet, for a total of 429 cases. The link in clear text: https://docs.google.com/spreadsheets/d/1kdID-XiYlL0TogDvAchSbhN5LqvJUjuCrlpED4RUHC8/edit?usp=sharing
I’m hoping that all or most of these (well, probably not the handful of XN) can be fixed before I do the second pass. DOAJ people have been kept in the loop. It’s fair to say that malware-infested journals that don’t get fixed–and OA journals where the site doesn’t work–really don’t belong in DOAJ: they either endanger users or are just useless.
Between now and the end of March, there’s lots of non-OA stuff to do (weeding, taxes, washing windows, etc…). I also expect to update template spreadsheets for this year’s book, resolve a terminology issue, and plan a new table that I believe will be especially useful. I may also do a bit of cleanup work.
Around April 1, I’ll start retesting 1,000+ journals that didn’t have 2018 (or all 2018) articles posted yet, allowing one more chance.
On April 15, I’ll start the second pass on the 800-odd problematic cases, including the malware issues.
Once that’s done, I’ll massage the data (adding derivative columns) and start working on the books that represent the final product. The dataset will be made available at figshare when the books are ready.
A couple of tidbits–and a trivia question
Last year, there were 10,293 fully-analyzed journals in GOAJ3. I was hoping for 12,000+ in GOAJ4. That’s still possible, but only if a lot of problematic journals are cleared up. There will be at least 11,477, and I’d guess at least 11,500–but 12,000 may be a stretch. [In 108 cases, namely XD, the journal can’t be analyzed: except for a very small handful of actual duplicates, these are journals that ceased before 2013 or changed names before 2013, and thus have no data to analyze.]
Last year, the total article count fell a bit short of 600,000. It will definitely be well above 600,000 this year (and, with added-but-not-new journals, will be above 600,000 for 2017). How far above? That depends on the second pass and how many journals get fixed…
Here’s the trivia question: Of 2,791 journals that charged fees in 2017 (and are still in DOAJ), 995 have lower fees this year than last. Since I’m all in favor of low or preferably no fees, I should find this enormously encouraging–but I don’t. Why is that? [If you’ve read GOAJ3, you should be able to figure this out…] No prize; leave answers in the (open) comments or send me email.
If you recall this post, the first part included a link to eight malware-infected journals within the first 3,000 scanned OA journals not from one of four countries, and said I’d probably add to that list when I was halfway through the remainder.
I’m halfway through. Of the 3,200 journals scanned in this group, there were four infected journals last year, including one “BM” (an infection at a lower level such that I could analyze the journal while blocking the infected call). This time around, unfortunately, there are more: 14 in all, including three from Colombia, two each from Bulgaria, Iran, Poland, and Ukraine, and one each from Chile, Serbia and Taiwan.
I’ve added those journals to the existing Google sheet and added a column for “Group,” where “2” is the earlier group and “3” the new group. When I finish the scan, “4” will be the remainder. (“1” is the original list of malware-infected journals issued earlier.) Within groups, they’re sorted by country, then publisher.
Here’s the link in plain text: https://docs.google.com/spreadsheets/d/1GgYqbnw3E-NJPiFZB1GeIJib8q0OGQNhxpRNaA56pyU/edit?usp=sharing
I’m hoping some or all of these can be fixed; I’ll retest them no earlier than April 15, 2019.
I’m also posting a Google sheet with XN (apparently not OA) and XX (not usable for various reasons) cases for the first 9,200 journals (except for Indonesia, where those were included in the malware sheet).
It includes 293 journals. Many of these are transitory problems, but this is still far too many. “Cod” defines whether the journal is XN (apparently not OA) or XX (other problems). “Note” offers a brief note on what’s going on, where I had one–e.g., error 404 (75 of them), ad (10) and parking (15) pages, “to”–timeout (34), “dns”–unable to resolve URL (60), “db”–database failure (13) and others, including blank pages (7).
I’ll run the first retest, of all XM/XN/XX journals and all journals where more 2017 issues might have been added, on April 15 or two weeks after I finish the first scan, whichever comes last. I’ll scan remaining XM a third time after finishing that scan. Then comes the fun part: data crunching and writing the books.
I will update both spreadsheets when the first pass is complete–some time in late March if all goes well. Since there were 17 XM and BM journals last year among those 3,200-odd journals, I anticipate slightly more additions than this time around. (Not surprising: most infections seem to happen at universities, and I’m scanning alphabetically by publisher, so U…: about 1,800 plus all the remaining universities that don’t start with Uni…)
Here’s a plain-text link to the second sheet: https://docs.google.com/spreadsheets/d/1kdID-XiYlL0TogDvAchSbhN5LqvJUjuCrlpED4RUHC8/edit?usp=sharing
You might think of this as a really tiny issue of Cites & Insights–and it’s the only one you’ll see in February or, almost certainly, March.
I’m roughly halfway through the initial scan of DOAJ journals for GOAJ4: Gold Open Access Journals 2013-2018 (“roughly”: I’ve done 6,000 and have 6,415 left to do). Seemed like a good point to pause, save off the first 6,000 (so I can save the “done” spreadsheet faster–in a second instead of two or three) and comment on a few things.
Malware So Far, Everybody Else
As noted in previous posts, I deliberately scanned four countries out of order because they had significant numbers of malware-infected journals last year, and I was hoping that DOAJ contacts could help get word to the infected publishers (all universities) so they could fix the problems. Note that the four countries, including two of the most prolific OA publishers, accounted for 3,058 of the journals scanned so far–leaving 2,942 from other countries, or about a third of all the journals from other countries.
The resulting posts and, in some cases, URLs for Google Sheets of the infected journals:
Indonesia, which has a worse malware problem than last year
Malaysia, which seems to have fixed its malware problem entirely
Romania, which has about the same level of infection as last year
Brazil, which has more than last year but still a tiny percentage.
Originally, I said I’d post the remaining cases when I was done with the scan–but I’ve changed that slightly, thinking that some cases could be fixed early. Instead, I’m posting a Google Sheet now that contains all the others in the first 6,000 journals. I’ll add to that sheet when I’m roughly three-quarters of the way done (say around 9,200 journals), probably early March, and add to it again when I’m done with the first pass (with luck, early April). The final scan of infected journals won’t happen until at least April 15 or two weeks after that post, whichever comes last.
Here’s the link as plain text: https://docs.google.com/spreadsheets/d/1GgYqbnw3E-NJPiFZB1GeIJib8q0OGQNhxpRNaA56pyU/edit?usp=sharing
Here’s the thing: the spreadsheet includes all of eight (8) journals, one in each of eight countries. There may be more in the remaining 6,415 journals, but so far it’s not a big problem.
Still Avoiding PlanS
In the October 2018 Cites & Insights, one of a group of small essays was titled “One I Do Not Plan to Cover,” explaining why I was ignoring PlanS (at least for the moment)–not even tagging items for later discussion.
Briefly, the reasons were that too much was being written for me to follow; that it’s European and I’m not; that I don’t know that I understand all the issues; that I’ve never published in APC-based OA journals (or in any peer-reviewed journals in quite a while); that I was seeing growing honesty from scholars of a sort I found disheartening; and that there was stuff about “academic freedom” that struck me as remarkable.
All those reasons are still valid, especially the penultimate one–and there’s another one that I predicted to myself and wanted to avoid.
That last one: I suspected the long knives would come out in force, attempting in various ways to undermine serious OA or any attempts to upset the current regime. That was a pretty safe prediction, to be sure…
As to the penultimate one: We’re seeing a lot of the Yabbuts come out: That is, “Oh, I’m all for OA, but...” Pretty much like “Some of my best friends are X, but…”
I suspect that around 70%-80% of scholars-who-publish just don’t care (or in some cases know) about OA as something that affects them. They have their access, coming out of the library’s budget (aka Somebody Else’s Problem) and they don’t much care about wider distribution for their scholarship–as long as it gets cited and/or helps them get promoted.
I don’t think there’s anything new about the Yabbuts. I do think they’ve been made aware that something serious might actually happen, making the BUT more important.
There’s also another reason I’m staying away: I haven’t read PlanS, and from what I’ve been unable to avoid hearing about it, I suspect I wouldn’t be wholly in favor (e.g., provisions that effectively make it unfeasible for the thousands of very small academic and society journals with no formal funding to keep going). And since I haven’t read the thing, I may be wrong…
So I’m staying away.
Really Unfortunate, If True
One final and somewhat blind note. I tagged an article that, if I didn’t misunderstand it in a brief skim, seemed to be seriously suggesting that one retired librarian should determine what articles should be included in review articles, and specifically biomed articles. Oh, not in those words, but arguing that articles in “predatory” journals–defined only by reference to The Lists–shouldn’t be included in review articles.
If I read it correctly, this is appalling. Here’s a counter proposal: no articles in any journal published by a publisher with an article that’s probably caused more lost lives (and recurrence of supposedly-obliterated diseases!) than all articles in Listed journals put together should be considered for reviews. Whoops: There goes 10% of the literature.
And, of course, I don’t really believe all Elsevier journals should be tarred because of one article that The Lancet took a long time–twelve years–to fully retract. That would be like smearing thousands of articles and hundreds of journals because one person thinks one of the journals looks bad, without providing any reason. Which is, of course, the whole thing with The Lists. [You can read a pretty good summation of the killer-article history in Wikipedia.]
Now I see that the Master of the Lists is writing for THE (Tabloid on Higher Education? I may have that wrong). And far too many people still treat the lists as significant.
Enough of this. Back to the journal scan: 6,415 left to go.